Do we need a policy against one-liner registered packages?

I’ve seen a few one- or two-line packages registered recently. I think the registration process should filter these out as premature. A WIP package with some work done is OK, but it seems this can be misused.

4 Likes

Do these one or two lines do something useful? If so, why shouldn’t they be allowed to exist?
Length is not always a good proxy for usefulness.

3 Likes

Well, I have a registered package that could technically count as zero lines (the top-level module is empty) and that does something useful. Still, chances are rather high that extremely small packages aren’t actually useful to anyone but the author (if even that) and never will be, so I don’t see a problem if such packages were excluded from automatic merging. On the other hand, chances are already high that those packages get flagged by one of the existing auto-merge checks.

The few I’ve seen, no: some were just a copy of the Example.jl package, others had only very limited code, such as:

using ForwardDiff  # needed for ForwardDiff.derivative

my_f(x, y) = 2x + 3y
derivative_of_my_f(x, y) = ForwardDiff.derivative(x -> my_f(x, y), x)

I’ve now seen it explained that it was a test, and I expect a package ported from Python will follow, so maybe I worry too much. I don’t believe there are “auto-merge” checks for such cases.

Also, in another case I was told it was a WIP.

Now I’m curious. I couldn’t locate it, and I would like to see such a counterexample…

https://github.com/GunnarFarneback/DynamicallyLoadedEmbedding.jl

Saw this. I’m not a fan of “registering this package name because I’m going to build it later”.

14 Likes

Does General not have a policy on name squatting? I just assumed it did because of how extensive the list of checks in https://github.com/JuliaRegistries/General#automatic-merging-of-pull-requests is, but couldn’t find any mention on a re-read.

Rust has a HUGE *** problem because a few dudes registered something like O(100) empty packages (“Contact me if you want to use this name”, crates.io) with popular names. See the later discussion on the Rust community GitHub.

I sincerely hope we won’t go down a similar road.

6 Likes

Yikes. That looks like a mess. I think the vast majority of empty or one liner packages I’ve seen registered in Julia are truly from people that fully intend to do something with it. However, just because you can’t find package “X” doesn’t mean someone isn’t working on it. Why should the first person to type “@JuliaRegistrator register” have precedence over someone who has an actual working product a month later?

I’m not sure if there should be a policy on this or not. I just feel like it’s decent human behavior.

7 Likes

Maybe just block automatic merging for packages smaller than X, so they can be dealt with on a case-by-case basis? (But that would require people, namely the General registry maintainers, to have the time to actually check… unlikely.)
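A minimal sketch of what such a size check could look like, assuming “size” means the number of non-blank, non-comment lines in the package’s `.jl` source files. The function names and the `min_lines` threshold are hypothetical and are not part of the existing General auto-merge checks. As noted earlier in the thread, line count is a crude proxy, so a check like this could only flag a package for human review, not judge its usefulness.

```julia
# Count lines that are neither blank nor a full-line `#` comment.
# Simplification: block comments (`#= ... =#`) and docstrings are counted as code.
function count_code_lines(source::AbstractString)
    count(eachline(IOBuffer(source))) do line
        s = strip(line)
        !isempty(s) && !startswith(s, "#")
    end
end

# Sum the code lines of every `.jl` file under `srcdir` (searched recursively).
function package_code_lines(srcdir::AbstractString)
    total = 0
    for (root, _, files) in walkdir(srcdir)
        for f in files
            endswith(f, ".jl") || continue
            total += count_code_lines(read(joinpath(root, f), String))
        end
    end
    return total
end

# Hypothetical policy: packages below `min_lines` skip automatic merging
# and wait for a maintainer to look at them.
needs_manual_review(srcdir::AbstractString; min_lines::Integer = 5) =
    package_code_lines(srcdir) < min_lines
```

Calling `needs_manual_review("path/to/MyPackage/src")` would then return `true` for the one- and two-line packages discussed above, while leaving the threshold itself as a policy decision.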

1 Like

I remembered that I was thinking of https://github.com/JuliaRegistries/General/pull/13791, which for whatever reason received human scrutiny despite having working code. Assuming that process scales, it would be nice to have for new packages.

Maybe just have an explicit policy that says “Maintainers may de-register a package if it has been registered for more than one month, does not do anything useful, and somebody else has asked to have the name for a package that they can show to do something useful and where the name would fit.”

This would reduce the pay-off of name-squatting to where it’s not worth the effort, so the policy would probably never have to be enforced.

9 Likes

If we could get this to work, I think it would be the perfect solution, but you have to be very explicit about what this phrase means: “can show to do something useful where the name would fit”.

One good place to start is to enhance the package contribution docs to discourage things we don’t like, and set an expectation of what a registered package should look like.

3 Likes

To me, this is an example where the Rust RFC process (while many times working great) seems awfully bureaucratic and paralyzing. If someone went and registered hundreds of empty packages in General to name squat you can be sure they would be cleared out within days.

8 Likes

Exactly. That is why I have raised concerns about this a few times: we really need a plan in place before Julia becomes so popular (fingers crossed) that the General registry has more activity than a handful of people can casually scroll through.

Fortunately, suspicious behavior in new pkg registration is fairly easy to filter out.

I just want to echo the idea that I am horrified by the possibility of people name squatting in the Julia General Registry. :frowning_face:

3 Likes

Concerning the OP, I think that one can have a manual garbage-collection of packages if that becomes a problem. But I want to look at that from another perspective.

It is possible, and likely, that those people do not have bad intentions. We have seen many threads here about people getting confused about what is a module and what is a package. What gets precompiled, what does not. What you can put as a dependency of your “real” package, what you can’t.

For instance, I have just written a small function to compute block averages for the data my students are generating. It is a simple 60-line module. I just thought: I would like my students to be able to install this and add it as a dependency of their projects if they want. I would also want it to be a dependency of another package I have, which is registered.

(PS: I know that they can install the package using its full GitHub URL, even if it is not registered. But I cannot use it as a dependency of other packages, at least I think so.)

I am not sure what the options are for that. I would like them to install the “package” with ] add BlockAverages, and to be able to update it like any other package. I would like it to be installed automatically if someone installs my registered package.

Yet I do not feel comfortable registering that, because it is too simple. I was just thinking that it would be very nice if I were able to add it to the General registry under a category that is my GitHub account. My GitHub name is m3g, and I would like to be able to install some packages using ] add m3g/BlockAverages.

Being in the General registry provides many advantages, and being under the name of my GitHub account would allow me to register whatever I feel is useful without the feeling that I might be taking a name that could be better used for some other, more important package.

Thus, I think that functionality, that is, the possibility of adding packages to the registry under an account name to avoid name conflicts, would be very useful.

I do not know if having a lot of simple packages in the General registry is bad for any reason other than name conflicts. If not, I think that flexibility would be a nice way for research groups, maybe companies, to share their developments without polluting the namespace of packages that people feel are really of general use.

Of course I could register an ugly name like m3g_BlockAverages, but that is not nice either. I have thought of registering something like M3GTools and adding things there over time, but that rapidly becomes a Frankenstein of useless code mixed with useful tools that nobody outside the group will ever use and that cannot possibly be a dependency of a real public project.

With that option, I think we would see many more small, useful packages registered, and even more modularity and code sharing in Julia.

8 Likes

To systematically share packages with a group of people (instead of everyone), you could create your own separate registry which, in principle, isn’t too hard (see my presumably outdated notes). The students would then only have to ] registry add ... your registry and could use the ] add ... functionality afterwards. However, in practice, creating and maintaining a custom registry (ideally with a RegistratorBot and such) can be a bit tricky. Would be great if we could simplify and document the process further.
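For the student-facing side of this, a minimal sketch using Pkg’s functional API, assuming the custom registry already exists and is hosted in a Git repository. The URL and the function name below are hypothetical placeholders, and this does not cover creating or maintaining the registry itself.

```julia
using Pkg

# Hypothetical placeholder URL; replace it with the group's actual registry repository.
const GROUP_REGISTRY_URL = "https://example.com/GroupRegistry.git"

"""
Add a custom registry to the local depot, then install a package from it.
"""
function install_from_group_registry(pkgname::AbstractString;
                                     registry_url::AbstractString = GROUP_REGISTRY_URL)
    # Equivalent to `] registry add <url>` in the Pkg REPL.
    Pkg.Registry.add(Pkg.RegistrySpec(url = registry_url))
    # Equivalent to `] add <pkgname>`; Pkg searches all installed registries.
    Pkg.add(pkgname)
end
```

With a real registry URL in place, a student would call `install_from_group_registry("BlockAverages")` once, and later updates would arrive through the usual `] up`.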

6 Likes

That can be a solution if:

  1. It is easy enough (I would cry for “automatic”, why not? Every GitHub user automatically being considered a registry?). Package management, while “easy”, already gives us a reasonable amount of extra work.
  2. A package in the personal registry can be added as a dependency of a package in the General registry. (I don’t know if that is possible, or if it works well for someone who didn’t add the registry manually.)
  3. Problematic conflicting names do not exist. For example, if my package has the same name as a package in the General registry, how do I add/use it if both are installed? (I really like the idea of ] add m3g/MyPackage and using m3g/MyPackage, or m3g.MyPackage.)
  4. Of course, sharing the code in that registry with other people (the world) is also easy enough.

Edit: one extra note. Someone reading this might think that I should stop being lazy :slight_smile: , follow the guide and create the personal registry. I just want to point out that making this really easy, if not automatic, would I think be a good thing for general code sharing in Julia: it would give people more freedom to partition their code into reusable blocks, and minimize these naming conflicts (the case of the original thread).

2 Likes