I wonder why translations always seem to need an extra format or framework. Is replacing or customizing translation files after deployment really a thing, or could it just be (generated) compile-time CODE in the language in use, based on a simple common format (like gettext with macros)?
I did A LOT of translation work, and in the end the best, easiest and fastest solution was always a list/map of key-value pairs, where the key is a unique translation identifier and the value is the translation in one language, optionally with placeholders (e.g. for numbers), combined with a simple macro (best case), function or method similar to sprintf.
Even plural forms should get their own unique identifier - automatic pluralization always failed for my use cases (Russian has several different plural forms depending on the context).
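A minimal sketch of that approach, in Python for illustration (the key names and the t() helper are made up, not from any particular library):

```python
# A translation table: unique identifier -> translated string with
# printf-style placeholders. Each plural form gets its own key, so the
# caller picks the right form up front instead of relying on automatic
# pluralization.
TRANSLATIONS_RU = {
    "cart.items.one":  "%d товар",    # 1, 21, 31, ...
    "cart.items.few":  "%d товара",   # 2-4, 22-24, ...
    "cart.items.many": "%d товаров",  # 0, 5-20, 25-30, ...
}

def t(table, key, *args):
    """Look up a translation by its identifier and fill in placeholders."""
    return table[key] % args

print(t(TRANSLATIONS_RU, "cart.items.few", 3))  # 3 товара
```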
Quite so, and in the case of resx, that's exactly what you're getting: a list of key-value pairs, with no support for pluralisation either (as you say, it gets hard; I understand Polish, for example, is very complex, perhaps Russian is the same?).
I never had a problem with resx and .NET satellite assemblies and all that, as far as the format goes. But it's always been an issue how to involve translators in a way that's simple for the translators, safe for quality, and as automated as possible in bringing the translations back into the app.
That may be the case, but resx is a quite bloated XML format for a simple key value pair listing. Besides that, resx is yet another format for the same thing.
I think the solution is quite simple:
- One unified key value pair format (for translators and GUI tools)
- One intermediate format that is programming language specific (it could be generated code or highly integrated formats like resx)
- A simple tool that can convert between those two formats
Workflow example:
- Export a unified format file from resx with placeholders for the translations
- Translators: Here you go, use your GUI tools on this
- Get back the translated unified format
- Import a unified format file to resx
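The export step of that workflow can be sketched in a few lines (Python for illustration; the element names match the standard resx data/value layout, but a real exporter would also handle escaping, resheader metadata, and so on — the sample strings here are hypothetical):

```python
import xml.etree.ElementTree as ET

# A tiny resx document with the usual data/value/comment structure.
RESX = """<?xml version="1.0"?>
<root>
  <data name="Greeting" xml:space="preserve">
    <value>Hello, {0}!</value>
    <comment>Shown on the start page</comment>
  </data>
  <data name="Farewell" xml:space="preserve">
    <value>Goodbye</value>
  </data>
</root>"""

def resx_to_units(resx_text):
    """Extract (key, source text, comment) units from a resx document,
    ready to be written out in whatever unified format translators use."""
    root = ET.fromstring(resx_text)
    units = []
    for data in root.iter("data"):
        key = data.get("name")
        value = data.findtext("value", default="")
        comment = data.findtext("comment", default="")
        units.append((key, value, comment))
    return units

for key, value, comment in resx_to_units(RESX):
    print(f"{key}\t{value}\t{comment}")
```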
Well, that's what the setup I have does, almost. If you consider .po/.pot to be a unified format file. I extract the original texts, comments and internal names from the resx into a .pot (Portable Object Template). This is then sent to a translator-centric web site service. A translated file is then exported, although I export it as a .po file, and then use a library and a little bit of my own code to implement an IStringLocalizer. As someone said elsewhere, more and more services do support .resx directly, so I could consider skipping the .po handling in my code, and just use the .resx.
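The lookup side behind something like an IStringLocalizer is essentially a dictionary loaded from the .po file. A deliberately over-simplified sketch of that, in Python (a real setup would use a proper PO library, as described above; this ignores multi-line strings, escapes, plural entries and headers, and the sample strings are made up):

```python
import re

PO_SAMPLE = '''
#. Shown on the start page
msgid "Hello, {0}!"
msgstr "Hej, {0}!"

msgid "Goodbye"
msgstr "Hej da"
'''

def parse_po(text):
    """Toy .po reader: maps msgid -> msgstr for single-line entries only."""
    pairs = re.findall(r'msgid "(.*)"\nmsgstr "(.*)"', text)
    return dict(pairs)

catalog = parse_po(PO_SAMPLE)
print(catalog["Goodbye"])
```

Note the .po quirk mentioned below about bloat: the full source text is the key, so the original string doubles as the lookup identifier.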
As for bloat, I find the .po format quite bloated too, with its use of the full texts as keys in each and every translation. I don't really like that, but in practice it appears to work well and has for many years. Then again, the obvious choice today would be JSON.
It still frustrates me that there are no built-in gettext-style affordances for C#. I guess it doesn't fit the way people typically build software in .NET. IMO the result is that it's more work to localize your software, so a lot of software that should get localized doesn't.
I built a gettext-style system for my own software, so I feel like it's possible to do. But the gap between that and making something everyone can use is pretty big.
There's nothing wrong with the .NET resource system with culture-specific satellite assemblies etc as such. The bigger problem I see is lack of tooling and services for translation by non-developers, volunteers, professional translation bureaus etc. You can't expect them to fire up Visual Studio, clone your repo and start editing resx files. Especially as there's not even a translation view! With .po files, there's a ton of editors, online services, translators etc available.
With Windows Forms, at least back in the day, you could open a Form in the designer, change its language to something else and just edit the text on the controls (or images, etc.). The changes would go into a resx file for the selected language. At first glance this looked quite cool, but on the other hand, translators can now ruin your UI, as there's not really a dedicated translation view that only allows changing resources ...
At a previous job we utilized https://webtranslateit.com/ - it organized everything easily for volunteers to provide translations for all resx strings.
Nice, but perhaps a little pricy for small startups. That's actually one of the things I didn't emphasize perhaps. Not only is the gettext/.po eco system big, there's also a lot of free or relatively low cost services available.
It's even more of a mess when doing multi-platform. I have a MonoGame Android prototype build of my game, and somehow .NET localization throws an exception at runtime. So it was easier to implement a service that loads and handles localization than to rely on the supposedly built-in mechanism.
You basically annotate all your strings with "my string".t() or t($"my interpolated string: {var}") and then use their CLI to extract the strings to be translated. It even includes Google Translate API support so you can kickstart the process.
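The extraction idea can be illustrated roughly like this (a Python sketch with made-up regexes over a source snippet; the actual tool works quite differently, so this only shows the general shape of the step):

```python
import re

# Hypothetical source code with the two annotation styles.
SOURCE = '''
var a = "my string".t();
var b = t($"my interpolated string: {var}");
'''

def extract_strings(source):
    """Collect string literals passed through the t() markers.
    Toy regexes; a real extractor parses the code (or the compiled IL)."""
    suffix = re.findall(r'"([^"]*)"\.t\(\)', source)   # "...".t()
    call = re.findall(r'\bt\(\$?"([^"]*)"\)', source)  # t("...") / t($"...")
    return suffix + call

print(extract_strings(SOURCE))
```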
Nice, very ambitious! I like the twist to extract from the compiled IL code, much easier, more stable and reliable than parsing the source. My one gripe here is that the code does not follow the .NET paradigm of using resources at all. Still, very clever and a lot of functionality.
We came up with our own localization framework back in the .NET 2.0 days and we're still using it. Version-numbered translations in an XML file, with a simple GUI to show missing translations, flag outdated translations, etc.
XML is a drag, but JSON isn't supported without a dependency under .NET Framework, and the single-binary NLTUI app masks that from most users. XML also gives you nice schema validation; for example, look at how simple the validation is in the GitHub Actions CI for the real-world app example from above: https://github.com/neosmart/EasyBCD-Localization/blob/master...
Thanks for all the feedback! The real issue, which several of the suggested alternatives don't handle, is how to actually manage the translation process. resx files as such work fine - but the problem is getting them from non-technical translators, possibly with their own tool preferences, or from volunteers without any tools or particular tech skills at all. That's what I'm trying to find a better way to do, while still not breaking .NET practices.
Considering resx has been supported for decades, why not just convert the resx into whatever the translator team wants?
Reading/writing resx is 10 lines of code or something. I just dump ours into an Excel sheet with one column per language, because for whatever reason the translators want it in Excel like that. Then when they are done, I convert it back to resx files.
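The "one column per language" dump really is about that small. A Python sketch of the export half, writing CSV rather than a real Excel sheet (Excel opens CSV fine; the file contents here are hypothetical):

```python
import csv
import io
import xml.etree.ElementTree as ET

# One resx document per language (toy contents for illustration).
RESX_BY_LANG = {
    "en": '<root><data name="Greeting"><value>Hello</value></data></root>',
    "sv": '<root><data name="Greeting"><value>Hej</value></data></root>',
}

def resx_values(resx_text):
    """Map resx key -> translated value."""
    root = ET.fromstring(resx_text)
    return {d.get("name"): d.findtext("value", "") for d in root.iter("data")}

def to_csv(resx_by_lang):
    """One row per key, one column per language."""
    langs = sorted(resx_by_lang)
    tables = {lang: resx_values(text) for lang, text in resx_by_lang.items()}
    keys = sorted({k for table in tables.values() for k in table})
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["key"] + langs)
    for key in keys:
        writer.writerow([key] + [tables[lang].get(key, "") for lang in langs])
    return out.getvalue()

print(to_csv(RESX_BY_LANG))
```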
In my case there's no "translator team" as such. It's a mix of volunteers, part-time paid translators, colleagues etc. I've never had a translator team, it's always various ad-hoc situations and networks of people. And I don't want to write and maintain conversion software and such workflows.
You can see my top-level comment for more info, but we wrote a very quick and dirty UI to manage the XML localizations that allows "forking" a language into a new localization and showing outdated or missing localization strings. I don't think a single person that contributed a translation was actually a developer, but I've had no complaints about it being difficult to follow in the going-on-twenty-years since its release!
Nice - but very special purpose from what I can gather. I've done a few custom solutions myself over the years, but that's exactly what I'm trying to get away from. I want to write code that does cool stuff for the users of my apps, not code that does cool stuff to make translations possible - someone else can do that cool stuff ;-) where I'm the "user".
Thanks. It's not actually that special purpose. One-click import from any SWF project, though there's no easy WPF/XAML SDK, for lack of motivation. The GUI is fully app agnostic.
Ok, maybe I was too quick to judge; I didn't spend enough time studying it. I looked at some screenshots, and it seemed like the translation tool knew things about the app, but I was apparently mistaken. Sorry.
The tool loads a directory of XML files; each file is rendered as a tab in the translation GUI. The framework exports the strings from each SWF form as a separate XML file into a directory that matches the native locale's identifier (e.g. en-US) and gives it a friendly name based off the form's title/caption and/or file name. So the end result is a naturally user-friendly approach to translation instead of having all the app's strings dumped together into one PO file as is the norm, and translators can reference the UI of the app they are translating as they methodically translate one component at a time.
I'm not sure if I'm missing something here, but many translation solutions support ResX files out of the box. For example, I use memoQ, which has built-in support for ResX files. Additionally, open-source solutions like OmegaT can handle ResX files as well.
Yes, resx support is finally becoming available in more solutions, but it's still not always a first class citizen. But perhaps that's the long-term solution, wait for resx to take over. Although it still doesn't support plural forms (then again my current setup doesn't either...).
gettext/PO is just as deficient at modern plural forms support as ResX, out of the box. The only format that I'm aware of that is specifically built to include it as first-class is Mozilla's Fluent. The same workarounds generally apply to ResX as to PO: create a number of separate strings and do a bit of math up front to choose the right one, or use a formatter on top of PO/ResX that supports something like ICU MessageFormat.
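That "bit of math up front" looks like this for Russian (a Python sketch of the standard gettext plural rule for Russian; the file-count strings are just an example):

```python
def russian_plural_index(n):
    """Return 0 ('one'), 1 ('few') or 2 ('many') for Russian-style plurals.
    This is the classic gettext plural expression for Russian."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... (but not 11)
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ... (but not 12-14)
    return 2      # everything else: 0, 5-20, 11-14, 25-30, ...

# Three separate strings, one per plural form, as the workaround suggests.
FORMS = ["%d файл", "%d файла", "%d файлов"]

def files(n):
    return FORMS[russian_plural_index(n)] % n

print(files(1), files(3), files(11), files(22))
```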
I remember that the last time I checked, the code page didn't even follow the standard (Shift JIS, maybe). Some of the characters are missing in that character set.
I've home-grown a solution for this a couple of times: directories of YAML files, where the default is en-US (for the projects I've worked on). The main set(s) of YAML files are keyed by path.filename as a prefix to the internal structure in each file. TypeScript types are then generated from that, and the other language files are assigned the default's type. Then checks for errors are run. From there, code generation for server-side usage is done.
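The core of that scheme is flattening the nested file structure into dotted key paths. A Python sketch, assuming the YAML has already been loaded into a nested dict (the checkout strings are hypothetical):

```python
# A nested structure as a YAML library would produce it.
EN_US = {
    "checkout": {
        "title": "Checkout",
        "buttons": {"pay": "Pay now", "cancel": "Cancel"},
    },
}

def flatten(tree, prefix=""):
    """Turn a nested mapping into path.to.key -> string pairs."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path + "."))
        else:
            flat[path] = value
    return flat

print(flatten(EN_US))
```

The generated flat key set is what the type generation checks the other locales against, so a missing or extra key surfaces as an error instead of a silent fallback.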
If you're referring to my app, Xecrets Ez ( https://www.axantum.com/ ), it runs in Windows, macOS and Linux.
As mentioned, the issue I'm trying to solve is not the code end; resx works fine once it's there. It's the translator end: how to present the texts, translations, context etc. to the human, often non-technical, translators. Often there are many of them, and one translator might only translate a few strings, then another one, etc. So it has to be really easy to use and to gain access to; I can't, for example, ask them to install a piece of software. Finally, once a text has been translated, how to get it back into the app as easily and in as automated a way as possible.
Thanks. Yes, I looked at AvaloniaUI, Uno, Xwt, and MAUI before finally deciding on AvaloniaUI. It hasn't been entirely frictionless: there are some glaring omissions, the documentation is not great, and if you're not already good with WPF, the learning threshold is pretty high. Still, it does work, and I have the app running on Windows, Linux and macOS with very, very little platform-specific code.