Does anyone know how to get en_US.UTF-8 sorting, where letters with a diaeresis and other accented characters sort next to their base letters, while still keeping capitalized names up front? I use LC_COLLATE=C to get capitals first, but then characters with a diaeresis don't sort correctly.
I tried to hack my way around fixing this a few years ago, but after many hours gave up.
To do that, you would have to create a custom locale definition file in /usr/share/i18n/locales with its own LC_COLLATE section, then compile it with localedef. See https://man7.org/linux/man-pages/man5/locale.5.html for the file format. /usr/share/i18n/locales/iso14651_t1_common is a (very) complex example you can read.
But, full disclosure: I have never done it myself.
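If the goal is only to sort a list of names in a script, rather than to change how ls and friends behave, there is a workaround that skips the custom locale entirely. This is not the LC_COLLATE approach described above, just a rough Python sketch of the ordering being asked for: capitalized names first, and accented letters compared as their base letters. The helper names (strip_marks, sort_key) and the sample list are made up for illustration, and stripping combining marks is only an approximation of real en_US.UTF-8 collation.

```python
import unicodedata


def strip_marks(s):
    # Decompose to NFD and drop combining marks, so "ë" compares as "e".
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def sort_key(name):
    base = strip_marks(name)
    # 1. Names starting with an uppercase letter sort first (False < True).
    # 2. Then compare base letters, ignoring case and diacritics.
    # 3. Fall back to the original string so ties are ordered consistently.
    return (not base[:1].isupper(), base.casefold(), name)


# Hypothetical sample data.
names = ["zebra", "Émile", "apple", "Zoë", "äpfel", "Banana", "Zoe"]
ordered = sorted(names, key=sort_key)
print(ordered)
```

With this key, "Zoë" lands right after "Zoe" instead of after every ASCII name, which is what LC_COLLATE=C gets wrong, and the capitalized names still come before the lowercase ones.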