Reduce package registration waiting period

Package registration is completely separate from JuliaHub – in the end it’s just a PR that can come from any source.


The best argument against a waiting period is probably the case where you want to register n packages with dependencies on each other at the same time – this quickly escalates to a >3n days wait time (at least with auto-merge compliant PRs, I think).
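To put a number on that, here is a rough sketch. It assumes a package can only be submitted once all of its dependencies have been merged, and that each merge takes exactly the 3-day wait; the package names and the `registration_day!` helper are made up for illustration.

```julia
const WAIT_DAYS = 3  # waiting period for a new package, as discussed in this thread

# Earliest day on which `pkg` can be merged, assuming its PR is opened
# the moment its last dependency has been merged.
function registration_day!(memo, pkg, deps)
    haskey(memo, pkg) && return memo[pkg]
    start = 0
    for d in deps[pkg]
        start = max(start, registration_day!(memo, d, deps))
    end
    memo[pkg] = start + WAIT_DAYS
end

# Hypothetical chain of three new packages: C depends on B, B depends on A.
deps = Dict("A" => String[], "B" => ["A"], "C" => ["B"])
days_for_C = registration_day!(Dict{String,Int}(), "C", deps)
```

For a chain of n packages this gives 3n days as a lower bound; any delay between a merge and the next submission pushes it above that.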

5 Likes

For R CRAN, new packages AND every update can take multiple days. Hard to argue they don’t have a vibrant ecosystem.

13 Likes

I think you misunderstand the registration process. Formally, no one is required to check the code (people can, if they want to, and then complain if they feel like it, but it is not formalized).

What happens is the following:

  1. you submit the registration request,
  2. people can comment, if they want to,
  3. if there are no objections about the name, your package is registered.

That is all.

I can’t speak for the registry maintainers, but I trust that they will do something if they find that this approach does not scale. I think there are plans to deal with this.

cf

4 Likes

Please kindly look up the facts about the registration process before engaging in the “Julia is doomed” line of reasoning.

Specifically, no one is deciding anything for anyone. If you comply with some minimal requirements, your package will be automatically merged.

Also, as pointed out multiple times on this forum: you can easily create your own registry where your rules apply. It will integrate seamlessly into the package manager.
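For reference, adding an extra registry looks roughly like this. The URL is a placeholder, not a real registry, and `add_my_registry` is just a name picked for this sketch; the call is wrapped in a function so nothing touches the network until you run it.

```julia
using Pkg

# Hypothetical URL of a self-hosted registry; substitute your own.
const MY_REGISTRY_URL = "https://example.com/MyOrg/MyRegistry.git"

# Once added, packages from this registry resolve like any others.
add_my_registry(url = MY_REGISTRY_URL) = Pkg.Registry.add(Pkg.RegistrySpec(url = url))
```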

4 Likes

You are right, I finally got a package accepted into R CRAN, but it was a long process. I like that it is simpler in Julia. Actually it is simpler than in Python, because the tools in Julia are, in my opinion, a lot more intuitive. My first package in Python, cec2013lsgo (GitHub: dmolina/cec2013lsgo, a package for using the CEC'2013 Large Scale Global Optimization benchmark, which was also used in CEC'2014 and CEC'2015), has Cython dependencies and took me a lot of effort. Submitting it to PyPI was simple and quick, but creating a setup.py with dependencies was not (at least the first time).

6 Likes

Please no. There aren’t that many really useful packages that should exist.

26 Likes

I think @Tamas_Papp linked the most relevant issue on this. There are also volunteers who maintain the General registry and who have granted shorter waiting periods when there was a need. Can things get better? Of course. But I’m not sure a sweeping decision is the most useful path forward.

1 Like

Again (you seem to have missed this): people are not online and checking the registry PRs 24/7. A 72h window gives them time to respond.

(Also, regarding your tone: please keep in mind that you are talking about people who are volunteering their time for this.)

5 Likes

Julia doesn’t need to be so widely popular for it to be successful. It’s kind of nice to have a language more focused on scientific computing, while still being a general-purpose language. Julia is appealing to scientists and people who love Julia.

Why would it be a goal to have hundreds of package registrations per day? That’s not a good goal, that’s a sign that people aren’t thinking carefully about the creation of new packages and doing it on a whim. It shouldn’t be a goal to maximize the number of new package registrations.

13 Likes

We’ve heard what you’re saying and if it seems to be a problem in the future, we’ll reconsider, but for now the three day waiting period is doing more good than harm, so we’ll keep it. Until that changes, I’m not sure there’s much benefit to debating it further.

I would observe that you seem to be coming from a perspective that the npm ecosystem should be emulated, but that’s pretty questionable from my perspective. I’d much rather have an ecosystem like R’s than JavaScript’s. There’s only a very loose association between the number of users—a proxy for the nebulous concept of “success” I guess—and the number of packages, and I cannot bring myself to see the millions of npm packages out there as a real success story.

30 Likes

Very humorous. I suspect R usage is at least 1000 times larger than Julia. Just about every Biology lab these days uses R for something. The NIH budget is quite small though, only around 5 times the entire NSF budget.

Indeed, rather than a gazillion individual packages each one doing some particular thing, it makes sense to have one package for some general types of things, and then add functionality to that.

Hear hear. R has an excellent ecosystem, plenty of diversity and many very high quality packages. In fact they require a lot of documentation that Julia doesn’t seem to require. You have to be fairly serious about creating a general purpose useful thing to get something into CRAN. That means most times you can library(xyzzy) and be sure that xyzzy is functional and there’s somewhere to read fairly comprehensive docs to find out what it does.

4 Likes

I will soon have to consider no longer using the General registry at all. I just don’t want to have my laptop filled with thousands of packages in my .julia\registries\General.

Oh dear, this fast living world. 3 days of waiting to register a new package is already leading to such a discussion.

…and I read it from the beginning :see_no_evil:

Seriously, is this really a problem? I like the waiting period, I am sure it prevented a lot of one-shots in the past, where people might have submitted some rubbish packages without thinking further, just because no-one would have looked at it with a 0s-waiting time.

6 Likes

Just to make sure you know, there are only lightweight files there containing versions and compatibility information (16 kb per package), not the actual package code.

1 Like

Not disrespectful perhaps, but a bit dismissive. Have you considered the possibility that people might disagree with you without the cause being that we’re all brainwashed?

I don’t think you’re wrong - I think there are perfectly valid reasons for having zero waiting period. But there are also valid reasons for having a waiting period. There are trade-offs, and the people that manage the julia registry have decided the balance of trade-offs favors a waiting period. You’re free to disagree with this choice, but I think people are pushing back because you seem unwilling to acknowledge that there’s more than one way to look at it.

For the record, I do not believe that the people saying “you’re free to start your own registry” are being dismissive - it’s an acknowledgement of the fact that opinions can differ on this point.

This does seem a bit disrespectful, but I’m not mad - at least you haven’t called us toxic :laughing:. You’re free to disagree, and I think you raise some good points. It’s possible (likely even) that the people most likely to use julia and certainly those most likely to be active on the forum are those that generally agree with the approach of the language as a whole, which is a better explanation for the uniformity of opinion than that we’re all dim-witted.

14 Likes

I’ve read this and other expositions of the same concept. One of the more radical versions of it was from Erlang’s Joe Armstrong: Why do we need modules at all? It’s a very interesting idea, but ultimately, in my opinion, a bad one. It “works” in the same sense that assembly language works: it is possible to create working software using this approach. It fails, however, in the same way as assembly language: human brains don’t work like that. People think in terms of coherent collections of related functionality that are designed and intended to work together. They do not think in terms of an undifferentiated soup of millions of individual functions that have no rhyme or reason. There’s a great power to consistency and patterns. Have you ever just guessed how something should work, tried it and found that it does exactly what you expected? Isn’t that great when it happens?

This seems to be a misunderstanding. The way npm works is that each package that depends on some other package gets its own copy of the specific version of that dependency that it needs. So it doesn’t matter if A and B need different versions of C—they each get their own copy which can be different versions.

Julia’s package system works in completely the opposite way: only a single version of C can be loaded in the same process and A and B use the same copy, so both of them must be compatible with a single shared version of C.
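As a toy illustration of that constraint (not how the real resolver works, and with made-up version data), picking the shared version of C amounts to intersecting what A and B each accept:

```julia
# Hypothetical data: the released versions of C and what A and B each accept.
available_C = [v"1.0.0", v"1.1.0", v"1.2.0", v"2.0.0"]
a_accepts(v) = v"1.0.0" <= v < v"2.0.0"   # A works with any 1.x release
b_accepts(v) = v >= v"1.1.0"              # B needs at least 1.1.0

# Only versions acceptable to both can be loaded in the shared process.
shared = filter(v -> a_accepts(v) && b_accepts(v), available_C)
chosen = isempty(shared) ? nothing : maximum(shared)
```

If `shared` comes out empty, there is no consistent set of versions and A and B cannot be installed together.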

Various people have suggested that Julia’s package system should work like npm’s instead (never mind that npm has regretted that and tried to change to a system where A and B can share a single copy of C when possible). However, this does not work well with multiple dispatch. Suppose there are two copies of C, call them C1 and C2, and each defines some type T and some function f with methods for T. Suppose an instance of C1.T gets returned to A and passed to B, possibly via the main project which calls them both. Suppose further that this instance of C1.T gets passed to C2.f. What happens? It’s a method error: C2.f only knows about C2.T and knows nothing about C1.T. Worse, you get an incredibly confusing error:

julia> C.f(x)
ERROR: MethodError: no method matching f(::Main.C.T)
Closest candidates are:
  f(::Main.C.T) at REPL[8]:3

People occasionally get these errors now when they’ve accidentally reloaded a module and still have an instance from the old version of the module around. It’s very confusing.
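The two-copies situation can be imitated in a single session with a minimal sketch like the following, where the modules C1 and C2 are stand-ins for two loaded copies of the same hypothetical package C:

```julia
# Two stand-in copies of the "same" package, modelled as separate modules.
module C1
    struct T end
    f(::T) = "handled by C1"
end

module C2
    struct T end
    f(::T) = "handled by C2"
end

x = C1.T()            # an instance created by the first copy

ok = C1.f(x)          # fine: C1.f has a method for C1.T

# C2.f only has a method for C2.T, so dispatch fails on a C1.T.
err = try
    C2.f(x)
catch e
    e
end
```

Here the module names differ, so the error is at least readable; when both copies print under the same name, the message looks like the one above.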

Anyway, that’s a bit of a digression, but in the end, no, Julia packages do not work the way that npm packages do. Having lots of tiny packages is not a great fit for Julia since it makes it much harder to figure out a consistent set of package versions.

23 Likes

I know that. Currently ~90 MB of them (and ~1 GB of packages, mostly dependencies of dependencies of …)

1 Like

The number of small files is a problem though, especially on Windows. To mitigate that, we’re planning on not extracting the registry at all in the future and just loading it directly from a compressed tarball. We have the technology these days.

14 Likes

You seem to be equating publishing a package with some sort of acceptance by the community. That’s not the case, as far as I understand, and publishing a package is not some “measure of success”.

You bring up NPM and PyPI as examples in your favor. I would suggest that both of these ecosystems are examples of environments we would do well to avoid. I have been arguing for years about dependency/ecosystem fragility and our need to avoid it in Julia. NPM is a disaster from a maintainability and security perspective, and those of us who work in high-security environments have had to deal with this pain (and most of us have chosen not to). This does not expand the use of Node in these workplaces; it absolutely hinders it. PyPI has had its own security issues with packages that got typosquatted and with packages whose authors either inadvertently or deliberately introduced malware. We have not, to my knowledge, had anything like that happen in Julia, though I suppose it’s only a matter of time, accelerated if we start accepting any package that gets submitted.

I would be in favor of two things:

  1. an official curated registry that sets a higher bar for inclusion than what currently exists, and
  2. changing the existing open registry’s requirements to include three things for a package:
    a) a README,
    b) a test suite (could be incomplete, but passing tests), and
    c) documentation.

There’s no harm in waiting 3 days for publication in someone else’s list. If people need to use the code, the wonderful Pkg system allows them to add directly via URL. Given what I’ve seen in the PRs lately, I would be ok with waiting a week, though I realize my position is a bit extreme. In any event the registry maintainers have a thankless task and should have an opportunity to ensure that their efforts result in a clean, coherent, and consistent Julia package environment.
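For anyone who has not used it, adding by URL looks roughly like this. The repository URL is a placeholder and `install_unregistered` is a name invented for this sketch; the call is wrapped in a function so nothing is downloaded until you run it.

```julia
using Pkg

# Hypothetical repository URL; replace with the real location of the package.
const EXAMPLE_URL = "https://github.com/ExampleUser/ExamplePkg.jl"

# Installs straight from the repository, with no registry entry required.
install_unregistered(url = EXAMPLE_URL) = Pkg.add(url = url)
```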

25 Likes

And maybe a

  1. “Julia Approved” :wink:
1 Like