Non-English packages in the General registry

Currently, the General registry has no guidelines regarding natural languages, other than that registered packages should be useful to others […, not only] for a closed group.

Given that the General registry addresses a wide international audience, it makes sense to me that packages should use “international English” both in their code (comments) and their documentation as the “lingua franca” of computing. Very occasionally, we get registrations for non-English packages, such as New package: RadarDOU v1.0.3 · Pull Request #155076 · JuliaRegistries/General.

I completely appreciate the usefulness of “localized” software in many contexts. In education at the pre-university level, having to work in a non-native language can be a significant barrier. Also, a package might have a “local” focus, such as the submission cited above that interfaces with a database specific to Brazil, and thus sensibly involves Portuguese.

The question is how we should treat such packages in the General registry. Portuguese, as a Western language written in the Latin alphabet, is probably something that English speakers could manage to interface with. But an API with function names that are entirely in Chinese or Hindi (non-Latin scripts), while technically possible due to Julia’s Unicode support, would be a considerable barrier to a wide audience.

It would make sense to me to explicitly exclude such packages from General, but to recommend alternative registries that aim at different natural languages. I don’t think we have any such registries, certainly none that are “official”, but I wonder what people think about this, or if there are any organizations who might be interested in taking up something like “national” Julia registries.

What I think should definitely be fine is multilingual packages, as I suggested in a PR to the package in question. At the very least, packages should probably have a README in English, while keeping any non-English versions as a secondary resource for non-native English speakers. Current coding agents and LLMs make such translations relatively easy to maintain. Not that automatic translation is entirely without pitfalls. Clearly, it’s preferable for a qualified human to have a hand in the translation. But, realistically, LLMs are getting pretty good at translation. (Of course, the general LLM guidelines still apply.)

There are also possibilities for a fully bilingual API, by exploiting Julia’s ability to have the default value of one keyword argument refer to an earlier keyword argument. For example, what I suggested in the PR is an alias `search` for the function `buscar`:

````julia
function buscar(
    client::RadarDOUClient;
    query::Union{String, Nothing} = nothing,
    date_from::Union{String, Nothing} = nothing,
    date_to::Union{String, Nothing} = nothing,
    section::Union{String, Nothing} = nothing,
    secao::Union{String, Nothing} = section,
    type::Union{String, Nothing} = nothing,
    tipo::Union{String, Nothing} = type,
    page::Int = 1,
    limit::Int = 20
)
    if isnothing(query) && isnothing(date_from) && isnothing(date_to) &&
       isnothing(secao) && isnothing(tipo)
        throw(RadarDOUError(
            "Pelo menos um filtro e obrigatorio: query, date_from, date_to, secao ou tipo.",
            "FILTER_REQUIRED",
            nothing
        ))
    end

    q = Dict{String, Any}("page" => page, "limit" => min(limit, 100))
    !isnothing(query)     && (q["query"]     = query)
    !isnothing(date_from) && (q["date_from"] = date_from)
    !isnothing(date_to)   && (q["date_to"]   = date_to)
    !isnothing(secao)     && (q["secao"]     = secao)
    !isnothing(tipo)      && (q["tipo"]      = tipo)

    _request(client, :GET, "/publications"; query = q)
end


"""
    search(client; query=nothing, date_from=nothing, date_to=nothing, section=nothing, type=nothing, page=1, limit=20)

Search publications in the DOU (Diario Oficial da Uniao). At least one filter is required.

# Example
```julia
search(client; date_from="2026-05-01", limit=10)
search(client; query="licitacao", date_from="2026-05-01")
search(client; query="edital", section="DO3", type="Edital",
        date_from="2026-01-01", date_to="2026-05-08")
```
"""
function search(client::RadarDOUClient; kwargs...)
    buscar(client; kwargs...)
end
````

Note that each version documents only its own localized keyword arguments, while actually accepting the full range of bilingual arguments.
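To make the behavior easy to try without the actual package, here is a minimal self-contained sketch of the same pattern. The names `buscar_demo` and `search_demo` are hypothetical stand-ins (they are not part of RadarDOU), and instead of sending a request the function just returns the filters it would send:

```julia
# Hypothetical stand-in for `buscar`: no HTTP client involved,
# it only collects the filters that would be sent to the API.
function buscar_demo(;
    section::Union{String, Nothing} = nothing,
    secao::Union{String, Nothing} = section,  # defaults to the English keyword
    type::Union{String, Nothing} = nothing,
    tipo::Union{String, Nothing} = type,      # defaults to the English keyword
)
    q = Dict{String, Any}()
    !isnothing(secao) && (q["secao"] = secao)
    !isnothing(tipo)  && (q["tipo"]  = tipo)
    return q
end

# English-named method that simply forwards all keyword arguments.
search_demo(; kwargs...) = buscar_demo(; kwargs...)

buscar_demo(secao = "DO3")                   # Portuguese keyword
search_demo(section = "DO3")                 # English keyword, same result
search_demo(section = "DO1", secao = "DO3")  # both given: `secao` is the one used
```

One design consequence worth being aware of: if a caller passes both spellings, the function body only ever reads the Portuguese name, so the English value is silently ignored. Whether that should instead be an error is a choice for the package author.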

This trick also works very well for the somewhat related use case of having “mathy” Unicode APIs with an ASCII fallback, following the usual style guide recommendation that public APIs should not require Unicode:

```julia
function f(; eta=1.0, η=eta)
  # ...
end
```
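As a concrete (and purely illustrative) instance, assume a hypothetical function `decay` with a rate parameter. Keyword defaults are evaluated left to right, so the ASCII name has to be declared before the Unicode name that falls back to it:

```julia
# Hypothetical example: exponential decay with rate `η` and ASCII fallback `eta`.
# `eta` must come first, because the default of `η` refers to it.
decay(t; eta = 1.0, η = eta) = exp(-η * t)

decay(2.0; eta = 0.5)           # ASCII spelling
decay(2.0; η = 0.5)             # Unicode spelling, same value
decay(2.0; eta = 0.1, η = 0.5)  # both given: `η` is the one the body uses
```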

Are there other techniques or creative ideas that might be useful for maintaining multilingual packages?

Fully support the idea of dedicated registries for specific languages that are not English.

I disagree 🙂
I think these national registries would be too small to see real use, and having a different registry would complicate the user experience.

I don’t think there is anything wrong with having a localized API for region-specific tasks in general. It would depend on the context, but I would leave this choice to the package maintainer, especially considering that inclusion in the registry is not intended as a quality badge.
I think the “README in English” rule would be enough.

Difficult to predict the size. And usability shouldn’t be a function of it.

It would. That is why package authors would need to think twice before writing a package in a specific language that excludes the vast majority of the community.

I see this policy as an incentive for authors to think deeply about when it makes sense to target a sub-population. It also saves the General registry maintainers from reviewing code in a language that they don’t know.

Worst-case scenario: people/bots write malicious software in a different language just to get it accepted in the General registry of the language.

Not sure if this is an issue but just throwing it out there. There’s this bit in the General registry’s requirements:

> In this registry, your package cannot depend on other packages that are unregistered. In addition, your package cannot depend on an unregistered version of an otherwise registered package. Both of these scenarios would cause this registry to be unreproducible.

Which I never entirely understood because it seemed to imply just about any local registry is unreproducible because most dependencies would be in General; I just guessed that having both registries added is enough. But if there are going to be separate central repositories for any reason, not just language, this shouldn’t be left up to a guess.

Well, not exactly. The General registry goes to great lengths to guarantee reproducibility. It is backed by a package server that caches packages, so that packages are still installable even if the original GitHub repo is deleted. It also does not allow deleting registered packages or altering the tree hash of a release. These are matters of policy. An arbitrary LocalRegistry someone might have does not necessarily make these guarantees. Having a “read-only” policy is relatively easy. Adding a package server would require actual infrastructure and is thus probably much less realistic. So, to the extent that a hypothetical “national registry” would not be backed by a package server, it wouldn’t guarantee quite the same level of reproducibility as the General registry. However, if it did, then the combination of General and that registry would be just as “reproducible” in its entirety.
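On the user side, combining General with a second registry is just a matter of adding both. A rough sketch, assuming a made-up registry URL (no such registry exists as far as this thread is concerned), wrapped in a function so that nothing is downloaded until it is called:

```julia
using Pkg

# Hypothetical URL, for illustration only.
const REGISTRY_URL = "https://example.org/RegistroPortugues.git"

function add_registries()
    # Add General explicitly alongside the custom registry, so that
    # dependencies registered there can still be resolved.
    Pkg.Registry.add("General")
    Pkg.Registry.add(Pkg.RegistrySpec(url = REGISTRY_URL))
end
```

After `add_registries()`, the resolver sees packages from both registries. Whether the result is reproducible in the long run then depends on the policies and infrastructure behind the second registry, as described above.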

If someone makes a cool open source package that is not in English, it would be annoying if I were not allowed to use it as a dependency in a registered package just because they have only registered it in some regional registry.

I disagree with the proposal to mandate English in General. That seems rather non-inclusive. If there is a lack of reviewers able to oversee the registration of packages in certain languages, maybe we can, as a community, look more proactively for Julia users of that language who are willing to volunteer?

I might argue that if it is “cool” enough to be a dependency of a registered package, it should be translated first. As a user of your package, I might very well want to “look under the hood” as part of debugging. If that brings me to a dependency that’s incomprehensible because all function names are non-English and all docstrings are in Chinese, I might very well consider that a problem. So I think that restriction would be intentional. The goal of the General registry, after all, is to foster a cohesive and reliable ecosystem, and it is not intended as a “free-for-all”.

(I’m not saying that’s my fully formed opinion, but it’s definitely part of why I might be concerned about non-English packages)

Some weaknesses of language-specific registries that your comment got me thinking of:

  • General is ironically not a great English-specific registry when there are dependencies in other languages. Even if we assume people will write packages in English for general use because it’s the lingua franca of Julia and programming in practice, that doesn’t cover a third-party package that provides English aliases, which can go a long way in dealing with the language barrier we’re trying to address. The most General would allow is translated forks.
  • Let’s say there are multiple packages like RadarDOU that deal with databases specific to written-language groups, and a package reasonably tries to unify them under one multilingual API (aliases, forwarded methods, distinct docstrings, various i18n efforts). It’s not clear where multilingual packages with no primary language should live, and the post currently assumes at least one language is English for General. The set of languages is not even necessarily static, as with a multilingual API for machine translation.
  • Despite how the question was posed, maintaining language-specific registries is not a national or geographically regional matter. Canadian English speakers are unlikely to object to an Australian English reviewer in General. When international cooperation does break down, we can’t presume people would rely on, or be allowed to rely on, a hostile foreign nation that happens to speak the same language.

If reproducibility comes from package caching and registry maintenance policies rather than packages being restricted to one registry, then that could be an argument against distinct resources. A separate organization reviewing packages in their native language doesn’t imply the policies and servers have to be different from General’s. It’d also be convenient for users to add GeneralEnglish and Geralportuguês from the same place as the English-dependency-only General.