Making and using alternate or additional registries with new Pkg

I’d like to create a curated registry of string-related packages for v0.7/v1.0. It would include most of the packages in JuliaString (leaving out a few that exist only for pre-v0.6 compatibility, such as ICU.jl, StringUtils.jl, and StringLiterals.jl), as well as other packages that meet all the requirements that applied to registration in METADATA, have been shown to work on v0.7 without deprecation warnings, and work with AbstractString and AbstractChar types, such as those provided by ChrBase.jl and StrBase.jl.
There are a number of such packages in BioJulia and also in people’s own repos, such as StringDistances.jl.

I’d like to know what is involved in creating such a registry (should it be in a repo named JuliaString/StringRegistry.jl?), and also what a user would need to do so that the registry is used (and takes precedence over Uncurated).

Thanks for all the interesting possibilities opened up by the new package manager, @StefanKarpinski and @kristoffer.carlsson!

I appreciate the enthusiasm, but please do recognize that we’re only two people (and @kristoffer.carlsson is on vacation this week) and we’re trying hard to deal with the large set of issues and improvements brought up by the very welcome onslaught of people trying out the new package manager in 0.7-alpha for the first time. I’m afraid we don’t have time to document the new registry format or help you create one right now. Once everything has settled down and things are working with the main registry, we will certainly get around to documenting that, but for now, you’ll have to look at Uncurated and mimic it and see how it goes.

Yes, and kudos to both of you for all of this work. As always, my questions and the issues I bring up are in no way meant as criticism of what you’ve accomplished. I just want to be able to use all of the new functionality as well as possible, and hopefully showcase some of the new capabilities of Pkg3 in doing so.

Do you have any pointers to scripts / etc. that you use to maintain the Uncurated registry that I could look at?

Thanks again! The impending birth of Julia v1.0 is very exciting!

Uncurated is generated from METADATA.jl by the set of scripts in the stdlib/Pkg/bin directory.

The main thing that’s a little tricky to understand is the general data compression scheme for describing properties of various package versions, e.g.:

https://github.com/JuliaRegistries/Uncurated/blob/e6b846888c9f03e25405850eead061cf1d0846dc/A/ACME/Compat.toml

To apply this to a specific version of the ACME package, say version 0.7.1, you go through each stanza and check whether the version range in its section header includes that version number. The data for 0.7.1 is the union of the key-value pairs of all the sections whose range includes 0.7.1, in this case:

ProgressMeter = "0.2.1-0.5"
IterTools = "0.1-0.2"
DataStructures = "0.2.9-0.8"
julia = "0.6-0.7"
Compat = "0.64-0.68"

These give the ranges of versions of these dependencies with which ACME 0.7.1 is compatible. These ranges only apply to actual versions of these dependencies within the same registry, so while they are expressed in compressed format, they are actually effectively just lists of specific registered versions of these other packages.

It is illegal for any version to be included in multiple stanzas that have key collisions, so the ordering of the stanzas doesn’t matter, nor does their specificity.
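
For anyone trying to mimic this by hand, here is a minimal Python sketch of the decoding rule described above. It is not the code Pkg uses. It assumes section headers are either a single version prefix ("0.7") or a "low-high" pair ("0.7-0"), that the upper bound matches by prefix (so "0.5" covers every 0.5.x), and that the file has already been parsed into a dict. The stanzas in `example` are made up for illustration and are not the real contents of the ACME file.

```python
def parse_bound(text):
    """Turn "0.7.1" into (0, 7, 1)."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, spec):
    """Check whether a version such as "0.7.1" falls inside a header
    such as "0.6-0.7" or "0.7" (assumed header syntax, see lead-in)."""
    lo_text, _, hi_text = spec.partition("-")
    lo = parse_bound(lo_text)
    hi = parse_bound(hi_text) if hi_text else lo
    v = parse_bound(version)
    # Lower bound: missing components count as zero, so "0.7" means 0.7.0.
    lo_ok = v >= lo + (0,) * (len(v) - len(lo))
    # Upper bound: compare only as many components as the bound gives,
    # so "0.5" admits every 0.5.x.
    hi_ok = v[:len(hi)] <= hi
    return lo_ok and hi_ok


def decode(stanzas, version):
    """Union the key-value pairs of every stanza that covers `version`.
    A key appearing in two matching stanzas is treated as an error."""
    merged = {}
    for spec, pairs in stanzas.items():
        if not in_range(version, spec):
            continue
        for key, value in pairs.items():
            if key in merged:
                raise ValueError(f"key collision on {key!r} for {version}")
            merged[key] = value
    return merged


# Hypothetical stanzas, shaped like a parsed Compat.toml.
example = {
    "0-0.6": {"julia": "0.5-0.6"},
    "0.7-0": {"julia": "0.6-0.7", "Compat": "0.64-0.68"},
    "0.5-0": {"DataStructures": "0.2.9-0.8"},
}

print(decode(example, "0.7.1"))
```

Because colliding keys are rejected outright, the result does not depend on the order in which the stanzas are visited, which matches the point about ordering and specificity.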

This same general compressed format is used for all of the per-package-version files, including: Versions.toml (although this is degenerate since each key-value pair is unique to an individual version, but it’s still a form of this format), Deps.toml and Compat.toml. The Package.toml file is the only exception since it describes the package itself and not data about individual versions of it.

Another example, decoding the Deps.toml file for ACME 0.7.1:

https://github.com/JuliaRegistries/Uncurated/blob/e6b846888c9f03e25405850eead061cf1d0846dc/A/ACME/Deps.toml

For 0.7.1 this decodes to:

Compat = "34da2185-b29b-5c13-b0c7-acf172513d20"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
IterTools = "c8e1da08-722c-5040-9ed9-7db0dc04731e"
DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8"

This is what would go in the [deps] section of that version’s Project.toml file (if it had one).
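
As a small illustration of that last step, here is a Python sketch that writes the decoded name-to-UUID pairs out as a [deps] section. The function name `render_deps` is made up for this example, and sorting the entries by name is only my assumption about a tidy layout, not something the registry format requires.

```python
def render_deps(deps):
    """Format decoded dependency data as a TOML [deps] section."""
    lines = ["[deps]"]
    for name in sorted(deps):  # sorted only for a stable, readable layout
        lines.append(f'{name} = "{deps[name]}"')
    return "\n".join(lines) + "\n"


# The decoded Deps.toml data for ACME 0.7.1 quoted above.
acme_deps = {
    "Compat": "34da2185-b29b-5c13-b0c7-acf172513d20",
    "ProgressMeter": "92933f4c-e287-5a05-a399-4b506db050ca",
    "IterTools": "c8e1da08-722c-5040-9ed9-7db0dc04731e",
    "DataStructures": "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8",
}

print(render_deps(acme_deps))
```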

One thing: Uncurated splits packages up by the first letter of the package name. For what I’m trying to do, with a small number of packages (probably no more than 50 for quite some time, including ones that are not in JuliaString, such as StringEncodings and StringDistances), almost everything starts with Str, so that split isn’t very helpful.
Is that layout a requirement? It looks like the relative location of each package’s registry information is recorded in the Registry.toml file. I’m wondering whether, if the path key is not present, it could default to the package name (i.e. for small curated registries, just have them all at the top level).

Thanks again for all the info during this very busy time for you!

The relative path is whatever is in the Registry.toml file, so you can choose whatever layout you prefer. I’m not sure about having a default, since it seems easy enough to just specify it. On the other hand, the non-sharded layout does seem like a reasonable default.
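
To make the two layouts concrete, here is a small Python sketch of the lookup being discussed. It assumes each package entry in Registry.toml has been parsed into a dict with a name key and an optional path key. The fallback to the package name is the default proposed in this thread, not something Pkg is known to do, and the helper `package_dir` and the sample entries are hypothetical.

```python
from pathlib import PurePosixPath


def package_dir(entry):
    """Return the directory, relative to the registry root, that holds a
    package's Package.toml, Versions.toml, Deps.toml and Compat.toml.
    Falls back to the package name when no path is given (the proposed
    default, which is an assumption and not current Pkg behavior)."""
    return PurePosixPath(entry.get("path", entry["name"]))


# Sharded layout, as in Uncurated: the path is spelled out.
sharded = {"name": "StrBase", "path": "S/StrBase"}
# Flat layout for a small registry: no path key, so the name is used.
flat = {"name": "StringDistances"}

print(package_dir(sharded))
print(package_dir(flat))
```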