A common library for internationalization and localization (i18n and l10n) is GNU gettext, which offers an ecosystem of tools for maintaining databases of translations for user-interface strings, along with software APIs for C, Python, C#, and more.
We’ve had a Gettext.jl package for a while now to access this functionality from Julia, but it relied on the Python implementation, which was unsatisfactory because it pulled in a huge PyCall.jl dependency for any package that wanted to use it.
Without knowing too much about the original package, the easiest workflow for me has been accessing the text through some sort of uniform identifiers, like “EN_Name”, “FR_Name”, etc. for different languages. The language I’m running is a global variable that is “EN” or “FR”, for example, and I just interpolate it into the identifier to get the proper name. That’s a simplified version of what I do, but it’s the general idea. For me, the most important part is being able to set the language once and then always retrieve the proper translation whenever text is requested, without the calling code needing to know which language is active. Not sure if that helps, but those are just my thoughts.
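For concreteness, the identifier-based workflow described in this post could look roughly like the sketch below. The table contents, the `TEXTS` and `CURRENT_LANG` names, and the `lookup_text` helper are all made up for illustration; none of them come from any package.

```julia
# Hypothetical table keyed by "<LANG>_<Name>" identifiers, as described in the post.
const TEXTS = Dict(
    "EN_Greeting" => "Hello",
    "FR_Greeting" => "Bonjour",
)

# The active language lives in a global; a Ref keeps it mutable behind a const binding.
const CURRENT_LANG = Ref("EN")

# Interpolate the language code into the key to fetch the matching text.
lookup_text(name::AbstractString) = TEXTS["$(CURRENT_LANG[])_$name"]

CURRENT_LANG[] = "FR"
greeting = lookup_text("Greeting")
```

Note that every call site has to refer to an abstract key such as "Greeting", and a missing key throws an error rather than falling back to something readable.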
The basic way it works is that you mark user-visible strings that could use translations, e.g. "A random event." with
_"A random event."
where _"..." is a string macro provided by the package (and there are also some more explicit functions). Then, gettext looks the string up at runtime using the current locale’s language (as specified by the OS), and returns the translation if available (or the original string if not).
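To make the lookup-with-fallback behavior concrete, here is a minimal, self-contained sketch. It is not how Gettext.jl is implemented: the `CATALOGS` dictionary, the `LANGUAGE` setting, and the `translate` function are assumptions invented for this example, and a real implementation would read the locale from the OS and the translations from compiled catalog files.

```julia
# Hypothetical in-memory catalogs: language code => (original => translation).
const CATALOGS = Dict(
    "fr" => Dict("A random event." => "Un événement aléatoire."),
)

# Stand-in for the OS locale (assumption: a real implementation queries the environment).
const LANGUAGE = Ref("fr")

# Return the translation for the current language, or the original string if none exists.
function translate(msgid::AbstractString)
    catalog = get(CATALOGS, LANGUAGE[], nothing)
    catalog === nothing && return msgid
    return get(catalog, msgid, msgid)
end

# A macro named `@__str` is what makes the `_"..."` syntax parse.
# It expands to a runtime call, so changing the language later still takes effect.
macro __str(s)
    return :(translate($s))
end

translated = _"A random event."
untranslated = _"No translation for this one."
```

The key property is that the original string doubles as the lookup key and as the fallback, so untranslated text degrades to the source language instead of raising an error.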
Separately, translators can create a .po file that gives translations. For example, you might have a .po file for French that includes a translation for the above string:
msgid "A random event."
msgstr "Un événement aléatoire."
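As an illustration of how simple the core of this format is, the sketch below reads single-line msgid/msgstr pairs into a dictionary. The `parse_simple_po` function is hypothetical and deliberately incomplete: real .po files also contain comments, plural forms, message contexts, escape sequences, and multi-line strings, so actual tooling should be used for anything beyond a demonstration.

```julia
# Parse only the simplest .po entries: single-line msgid/msgstr pairs.
# Everything else (comments, plurals, contexts, escapes, multi-line strings) is ignored.
function parse_simple_po(text::AbstractString)
    translations = Dict{String,String}()
    msgid = nothing
    for line in split(text, '\n')
        m = match(r"^(msgid|msgstr)\s+\"(.*)\"\s*$", line)
        m === nothing && continue
        if m.captures[1] == "msgid"
            # Remember the key until its msgstr line arrives.
            msgid = String(m.captures[2])
        elseif msgid !== nothing
            translations[msgid] = String(m.captures[2])
            msgid = nothing
        end
    end
    return translations
end

po = """
msgid "A random event."
msgstr "Un événement aléatoire."
"""

catalog = parse_simple_po(po)
```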
The whole machinery of .po files is set up by GNU gettext, and there are various tools to help you write them. They are designed so that translations can be contributed by non-programmers. There are even scripts to auto-generate .po files using Google Translate or LLMs. This kind of tooling is a big advantage of using a well-established solution for software localization.
Gettext.jl just hooks into this pre-existing translation machinery. The only question is how best to express the API in Julia.
This looks great! The one thing I find a bit odd at a glance is the choice of _"..." string macros. _ doesn’t really indicate anything about the nature of the macro. Even with short forms like r"..." (regex) and v"..." (version) you can make decent guesses. Arguably rx"..." and ver"..." would be clearer options still.
Is there a particular reason you didn’t go for i18n"..."/text"..."/getx"...", or even just tx"..."/T"..."/I"..."/i"..."?
As @ericphanson said, this comes from the standard Gettext syntax _("...") that has been used for 30 years in C and other languages — the basic motivation was to minimize visual code clutter introduced by internationalizing your code, making it as tempting as possible to use extensively. We didn’t want to second-guess this well-established choice.
(Realize that, in UI-heavy code, you could have a lot of _"..." strings, making every character precious.)
Of course, you can always do
using Gettext: @__str as @i18n_str
if you want another prefix, but I would tend to encourage people to stick with the standard idiom.
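The renaming mechanism itself can be tried without installing anything. In the sketch below, `TinyI18n` is a made-up stand-in module whose "translation" just uppercases the string so the effect is visible, and `tx` is an arbitrary example prefix. The `using ... as` form for macros requires Julia 1.6 or later.

```julia
# Hypothetical stand-in for a package that provides a `_"..."` string macro.
module TinyI18n
translate(s) = uppercase(s)   # placeholder "translation" so the effect is visible
macro __str(s)
    return :(translate($s))
end
end

# Bind the same macro under a different name; `tx"..."` now calls `@__str`.
using .TinyI18n: @__str as @tx_str

renamed = tx"hello"
```

One thing to keep in mind with a renamed prefix: string-extraction tools that look for the standard marker would presumably need to be told about the new one.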
This makes more sense now. I do still wonder if this is one of the things done in C that we want to copy… but this isn’t a domain I’ve done much in. I am very much pro short but indicative names though, even if it’s just tr"..." or I"..." etc.
Note that I"..." is already used in at least 3 packages (Tk.jl, Strs.jl, IntervalArithmetic.jl), and tr"..." is used in one other package. Another nice thing about _"..." is that no one else will have the gall to use it unless they are following a 35-year convention adopted by vast amounts of i18n code in C/C++, Python, Ruby, and other languages.
Gettext already has a convention for this: N_"..." is a no-op translation, used to mark strings that should intentionally be left untranslated (e.g. to exclude them from automated translation/string-extraction tools).
Since adding a @_ is not breaking, we could decide to do this later if it turns out to be useful in practice.
Now that I think of it, I think this is a chunk of why I’m a bit uncomfortable with _"..." besides the unfamiliarity. I already think of a lone _ as being a bit special in a function argument/destructuring sense in Julia.