Reduce package registration waiting period

I generally agree with Stefan’s point that the 3-day waiting time currently does more good than harm, but for future reference here are some use cases where it annoyed me quite a bit:

Publishing dependent packages

Usually I have whole chains of packages, say, A <- B <- C, where A is a pure library, B depends on it, and C depends on both and is private. To test C for production, I need to ]add them as if they had already been registered, pin concrete versions of A and B, and run the tests. With 2 new packages, that means up to 6 days (two consecutive 3-day waits) before I can be sure C is ready to be deployed.

Perhaps I could use LocalRegistry and deploy without any waiting, but then I would have little real motivation to actually make these packages public later.

Figuring out structure of dependent packages

Another problem when developing multiple interdependent packages at the same time is figuring out what code should go where. In my day-to-day job in industry, where I mostly write in Python and Scala, we restructure packages approximately every 3 months. Sometimes such packages live for just a few days: we create a new package, try out a new structure, realize its limitations and refactor again.

In open source and with public packages it happens more rarely, but it still happens. For example, I remember the Hadoop ecosystem maturing, with new packages being separated, merged with others and deprecated pretty quickly. Usually these are not user-facing packages (those stay stable) but helper libraries covering specific needs.

Moving common functions to a separate package

Once I had a set of utils that I used in 2 or 3 packages. These were very simple things without a common domain of their own, e.g. @get - a macro similar to get, but not evaluating the second argument unless needed, or the @runonce macro, which avoided re-evaluation of some pieces of code, etc. I thought about creating a package with some dummy name like LittleGoodThings and moving all the utils there, but I couldn’t come up with a reasonable description and decided simply to duplicate the code.
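For readers unfamiliar with helpers of this kind, the two ideas can be sketched roughly as follows. This is a Python stand-in, not the Julia macros themselves: `get_lazy` and `run_once` are hypothetical names, and passing a function plays the role that macro expansion plays in Julia.

```python
def get_lazy(mapping, key, make_default):
    """Return mapping[key] if present; otherwise call make_default()."""
    if key in mapping:
        return mapping[key]
    # The default is only computed when the key is actually missing.
    return make_default()


def run_once(func):
    """Wrap a zero-argument function so its body runs at most once."""
    unset = object()
    cached = unset

    def wrapper():
        nonlocal cached
        if cached is unset:
            cached = func()
        return cached

    return wrapper
```

With a plain `dict.get(key, expensive())`, the expensive call would run even when the key exists; deferring it behind a callable avoids that.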

Announcing a package

You know the feeling when you are ready to release the first version of a package you’ve been working on for several months. Over the weekend you finish the last tests, write the docs and an encouraging announcement to post here on Discourse. Then you submit a PR and… wait until Wednesday for it to get merged. In the meantime the enthusiasm fades, you get back to your day job and only come back to the package in the middle of Friday. Not a big price for something you’ve spent several months on, but it also doesn’t encourage more active development.


None of these use cases were really blocked by the waiting time, but it made things pretty inconvenient, so hopefully in some distant future this last issue will also be overcome.

3 Likes

While I agree that this can be annoying, and I have had to deal with it myself, it is also a good thing. What’s good about it is that it makes you think twice about registering a bunch of deeply nested packages. The expectation of a longer wait forces people to plan nested dependencies carefully, and maybe you will think of a different approach to interoperability, which you might not do if you could lazily register tons of nested packages all at once without waiting.

7 Likes

Just write the announcement on Sunday and save it somewhere.

6 Likes

really interesting discussion, and both sides have valid points. what i like about the debian packaging ecosystem is that they have very specific guidelines for submitting a package, and because the guidelines are so specific, they have tools to check whether your package passes them. the tooling checks for name conflicts, documentation, etc. it is like a linter that catches common mistakes. we like coding because we are basically lazy people. we let our imagination automate the repetitive tasks, and automation is a test of whether we have a clear idea of the guidelines we want to execute.

maybe we need a lintian tool similar to debian:
https://github.com/Debian/lintian

The General registry also has very specific guidelines, and the CI bot tells you what needs to be remedied in case of a failure. The waiting period just allows constructive comments beyond these requirements, but such comments will not by themselves prevent merging once reviewed. There are quite a few packages where the author insisted on the original name and it got merged.

That said, the comparison with Debian (or any major distro) highlights an important difference: all packages in the distribution are implicitly guaranteed some form of maintenance. Security fixes at the minimum, and usually critical issues are also prioritized. If the maintainer cannot or does not do this, others step in in a timely manner. Eventually, if upstream is abandoned and this becomes impossible, the package is removed from Debian (or in some rare cases, forked and maintained a bit more). Because of this, uniform standards are critical for Debian.

In comparison, the General registry is closer to a structured database of package metadata: it takes releases as given, and imposes no explicit quality requirements on registered packages.

2 Likes

by the way, i like the implicit standardization when you submit your package for publication under the tutelage of @matbesancon. from the experience i had, i’m pretty sure the packages accepted for publication have great documentation and usability thanks to the help of the reviewers. peer review of submitted packages is great, and maybe if we can flag those packages as peer-reviewed and accepted for publication, it can indicate some form of quality.

regarding debian packaging, i don’t mean to take it literally. if there is tooling that suggests the common naming style to packagers, checks for similar names, suggests categories, suggests similar existing packages, etc., the actual submission will have fewer issues because the offline tooling already catches those common problems.

the idea of an offline package lintian checker is that new developers don’t have to wait until they submit their package to the General registry to be told that the package name should be changed, that it needs more docs, docstrings, etc. if they can run a lintian offline while developing a package, they can incorporate these expectations incrementally, and by the time they submit the package, the process will be straightforward and the fixes minor because they have followed the julian way from the start.
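as a rough illustration of the "check for similar names" part, here is a minimal sketch in Python. the function name, the sample list of registered names, and the 0.8 similarity cutoff are all assumptions made up for the example; this is not the General registry’s actual rule.

```python
import difflib


def similar_names(candidate, registered, cutoff=0.8):
    """Return registered names that look confusingly close to candidate."""
    # Compare case-insensitively, but report the original spelling.
    by_lower = {name.lower(): name for name in registered}
    close = difflib.get_close_matches(
        candidate.lower(), list(by_lower), n=5, cutoff=cutoff
    )
    return [by_lower[name] for name in close]


# Hypothetical sample data standing in for a local copy of a registry.
registered = ["DataFrames", "Plots", "Flux"]
print(similar_names("DataFrame", registered))
print(similar_names("LittleGoodThings", registered))
```

a real tool would read the names from a local clone of the registry instead of a hard-coded list, and the cutoff would need tuning to avoid false alarms.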

5 Likes

running lintian on your package under development, and having it remind you of the missing docstring in function blah or the missing examples in function blah, will help new developers get into the habit of working with julia packages. it can also help existing packages: running lintian every time one makes a release makes sure those docs/examples are there, that new function names don’t conflict with base packages, etc. lintian would serve only as a guide to developers, but at least it can enforce the most common expectations without anyone re-reading the guidelines, because lintian checks them for you.

I look forward to using this package :slight_smile:

1 Like

i’m tempted to implement this in PERL :joy:. it will parse the evolving guidelines and also record the names of the functions in base, create an implementation for each guideline (by regular expression, guideline syntax parser, etc) and check the package for any violations.

it can also include summary statistics: number of functions with no docstring/examples, number of functions with names similar to those in Base, number of functions with more than 10 lines of code ;), number of function names not in standard format, number of global variables, etc.
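a minimal sketch of the first statistic, using the regular-expression approach mentioned above but in Python rather than Perl. it is deliberately naive and rests on assumptions: it only sees long-form `function name(...)` definitions, and it treats a function as documented only when the line directly above it ends a string literal. the name `undocumented_functions` is hypothetical.

```python
import re

# Matches long-form Julia definitions such as "function area(r)".
FUNC_DEF = re.compile(r"^\s*function\s+([A-Za-z_][\w!]*)")


def undocumented_functions(source):
    """List names of functions with no docstring on the preceding line."""
    names = []
    previous = ""
    for line in source.splitlines():
        match = FUNC_DEF.match(line)
        # A docstring ends with a double quote right above the definition.
        if match and not previous.endswith('"'):
            names.append(match.group(1))
        previous = line.strip()
    return names


# Small Julia snippet used as sample input.
sample = '''
"""
    area(r)

Area of a circle.
"""
function area(r)
    return pi * r^2
end

function helper!(x)
    x .= 0
end
'''
print(undocumented_functions(sample))
```

short-form definitions like `f(x) = x + 1`, macros, and comments between a docstring and its function are all ignored here, so a real checker would need an actual parser.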

1 Like

Exhibit A.
Empty packages with the most common names:
https://crates.io/users/swmon

Discussion threads in the Rust community:

Mitigation: manual intervention by maintainers:

3 Likes

A couple of points:

Pkg has no problem with duplicate names, so squatting a name like this does not prevent someone else from using it. It prompts for which one you want, and in the future we could provide download stats and/or stars to help people decide. I doubt an empty package will get a lot of stars or downloads, so it would not take long for a real package to rank higher than a squatter.

We should actively archive unmaintained or empty packages. We have already done one round of this for packages that don’t support Julia 1.0 at all; more archivings can occur in the future.

13 Likes

Can / should this be automated?

1 Like

Yes.

5 Likes

The choice to identify packages by UUID was really a smart one. It’s impossible to squat the UUID address space :wink:

What does it look like for the user when two packages get registered with the same name? Suppose you somehow want both: is that possible? How does one go about using them?

1 Like

I’m still waiting for someone to insist on a particular “vanity UUID.” It’s not currently possible to directly use two different packages with the same name from a single project. A project and one of its dependencies can do so, however.

1 Like

Sorry, I claim firsts on this. Your response to me at the time was “as long as you don’t complain if there’s a collision, go ahead”.

I never really followed through, but perhaps I should :slight_smile:

1 Like

I’m still waiting for someone to insist on a particular “vanity UUID.”

Wait no longer:

4 Likes

That’s not a UUID. :wink:

It obviously doesn’t conform to version 1, version 2, or versions 3 and 5. You could argue that it’s a version 4, but besides not actually being generated by any randomness, I’m guessing it doesn’t mark itself properly as a version 4 ID.
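For anyone who wants to check a candidate themselves, here is a minimal sketch using Python’s standard `uuid` module. The vanity string below is a made-up example, not the ID posted above, and `describe` is a hypothetical helper name.

```python
import uuid


def describe(text):
    """Return (variant, version) for a UUID-shaped string."""
    parsed = uuid.UUID(text)
    # .version is None unless the variant bits mark the ID as RFC 4122.
    return parsed.variant, parsed.version


random_id = str(uuid.uuid4())
vanity_id = "deadbeef-dead-beef-dead-beefdeadbeef"  # hypothetical vanity value

print(describe(random_id))  # version field reports 4
print(describe(vanity_id))  # version is None: variant bits are not RFC 4122
```

The version lives in the first hex digit of the third group and the variant in the first hex digit of the fourth group, so a hand-picked string only counts as a version 4 ID if those two digits happen to be set correctly.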

1 Like

Just an FYI, there is an open PR (https://github.com/JuliaRegistries/General/pull/16058) that proposes some changes to the policies and documentation of the General registry.

2 Likes