Getting to 1.0 -- How can we get packages to "make that tag"?

cce · April 19, 2021, 7:29pm

So many core packages in the Julia community, such as DataFrames, are still pre-1.0 well after they have become ubiquitous. This causes a bit of confusion. Is the lack of a 1.0 a suggestion to users that they shouldn’t depend upon it for production work? What is the commitment to keep the API stable? Having so many well-worn packages not put out that 1.0 tag reminds me of the IETF calling everything a “Request for Comments”. At some point, it’s not really a request for comments, despite the title.

This lack of clarity is an active annoyance. When I declare a dependency on DataFrames, how should semantic versioning work? Does everyone have that shared understanding? How about dependencies of DataFrames… are their APIs going to be stable? This sort of low-level churn is tiring, especially when you want bug fixes and improvements, but don’t want any big breaking changes.

Stable yet pre-1.0 packages are even more annoying once dependent packages enter the picture. Just today, I noticed DataFrames was somehow downgraded in a local environment. Here’s an example of what changed when I asked for a newer version…

(@v1.7) pkg> status DataFrames
      Status `~/.julia/environments/v1.7/Project.toml`
  [a93c6f00] DataFrames v0.21.8

(@v1.7) pkg> add DataFrames@0.22
   Resolving package versions...
    Updating `~/.julia/environments/v1.7/Project.toml`
  [134e5e36] ↓ Catlab v0.12.0 ⇒ v0.11.2
  [a93c6f00] ↑ DataFrames v0.21.8 ⇒ v0.22.7
  [7f904dfe] ↑ PlutoUI v0.7.1 ⇒ v0.7.6
  [08abe8d2] ↓ PrettyTables v0.12.1 ⇒ v0.11.1
    Updating `~/.julia/environments/v1.7/Manifest.toml`
  [324d7699] ↑ CategoricalArrays v0.8.3 ⇒ v0.9.5
  [134e5e36] ↓ Catlab v0.12.0 ⇒ v0.11.2
  [a93c6f00] ↑ DataFrames v0.21.8 ⇒ v0.22.7
  [069b7b12] ↓ FunctionWrappers v1.1.2 ⇒ v1.1.1
  [d96e819e] + Parameters v0.12.2
  [7f904dfe] ↑ PlutoUI v0.7.1 ⇒ v0.7.6
  [2dfb63ee] ↑ PooledArrays v0.5.3 ⇒ v1.2.1
  [08abe8d2] ↓ PrettyTables v0.12.1 ⇒ v0.11.1
  [189a3867] ↑ Reexport v0.2.0 ⇒ v1.0.0
  [3a884ed6] + UnPack v1.0.2

This is somewhat confusing to me. I guess some other package may have been holding DataFrames back, which was preventing me from seeing PlutoUI fixes? Yet, when DataFrames was manually upgraded, it caused PrettyTables and Catlab to downgrade. In any case I’m kinda confused. Regardless, figuring this out seems like re-arranging deck chairs.

What exactly is the reluctance of so many to “just tag it”? One part is perfectionism… well, I get that. But there is a more technical reason for the reluctance. Once you tag v1.0, what is the next release? In theory you’d want to do a v2.0 for the next breaking change. But, the next release probably won’t be something one would want to call production. So, once you’re at v1.0, you’re kind of stuck. Or so it seems. Perhaps this is where the reluctance comes from?

I have a suggestion. Once you’re stable enough for a release people are actually using, get it to v1.0. Prune the code of experimental stuff. Get documentation to release quality. Get the coverage tests up there. Tag v1.0 and… have a small party. Take a breather.

Then, go back to before the prune and fork, giving it another name (say MyProject2). Then, put that project at a more modest version number… v0.1 … that signals your willingness to move fast and break things. Work in this project for as long as needed, till it stabilizes. With a different package name, your users can even install them side-by-side. Backports are even possible. Then, once it’s stable, merge your changes into the main project, and release v2.0.

pdeffebach · April 19, 2021, 7:32pm

Fwiw, both DataFrames and PrettyTables will be tagging 1.0 in the next few weeks. So people definitely understand the issue and are working towards it already to some extent.

dilumaluthge · April 19, 2021, 8:00pm

See also: How can we encourage Julia package developers to release version 1.0.0?

mbauman · April 19, 2021, 8:03pm

There are definitely projects/packages that could use more nudges towards tagging 1.0. Calling out DataFrames, however, feels wildly unfair at this moment as they’ve been explicitly working towards 1.0 and are so very close to the finish line.

I get that navigating package compatibilities and downgrades can be annoying and opaque — and I think that’s at the root of your frustrations. Your experience likely would’ve been the same had DataFrames v0.21 and v0.22 been named v1 and v2. Some tooling improvements could go a long ways here, I think.

cce · April 20, 2021, 3:38am

@pdeffebach & @mbauman Oh dear. I certainly didn’t mean to disparage DataFrames – and I’m thrilled to see 1.0. I enjoy using Tables very much, and look forward to using PrettyTables. We have a fantastic community with some amazing contributors.

@dilumaluthge Thank you for pointing to this other thread; I found tkf’s response informative. It references ticket #33047 and a Pkg3 discussion thread.

What I’m asking is… what makes those who have well-utilized packages so reluctant to tag 1.0 ? Is there another way?

I don’t think this is necessarily true. To many, there is a qualitative meaning of zero-point releases that indicates that the work simply isn’t ready to be maintained. While, once v1 is released, one expects a bit more stability. Once you tag v1.0 I think some may feel more constrained, introducing far less churn (and, also, less innovation). Perhaps this expectation and limitation may be what keeps many packages to zero-point releases, even well after they are broadly adopted?

Could we think of a different workflow?

Let me pick on myself. I’m reluctant to put HypertextLiteral at v1.0 – mostly, at this point, because I’ve not gotten enough community feedback. In particular, there are a few design choices I’m not so sure about… so, do I tag, or do I wait?

I think I should tag v1.0 soon. HypertextLiteral has reasonable documentation, test coverage; it is beyond minimum viable, and it has a growing user community. I think a v1.0 tag brings with it a commitment to only accept bug fixes or incremental improvements if they won’t break compatibility.

But then, what for the “next version”? What about the experimental features I’m musing about? With existing conventions, I’m put in a bit of a dead spot. How do you release new improvements, that might break stability? Is it v2.0 that you recommend people to -not- upgrade to?

So, taking inspiration from @quinnj’s JSON, JSON2, and JSON3 packages, perhaps I should think of this as a fork instead? Just like JSON2, I might call it HypertextLiteral2.

The v1.0 series remains quite stable in a branch, and users don’t need to suffer from the churn.
The experimental version can start at v0.1 which indicates that it is likely going to be unstable (or even abandoned!).
Those that want to provide feedback to the experimental version could install a different package. In fact, they could even install them side-by-side… if I don’t export too much.
They could even exist in the same github repository, just in different branches.
Once the experimental branch is stable enough, it could then be merged back into the main project, producing v2.0.

Perhaps a convention like this might help address a “reluctance gap” by providing a way forward that isn’t restrictive. More broadly, perhaps tagging Foo v1.0 is currently not an option if the author expects very significant changes in the coming year. However, perhaps by using a new package name, Foo2, they could start from v0.1 again, and then, once it’s good enough for a general release, jump right to v2.0, and pull the entire updated version into the main repository.

Skoffer · April 20, 2021, 4:01am

Well, names like Foo2 are forbidden, so it is not an option. JSON is a bad example that will never happen again.

Oscar_Smith · April 20, 2021, 4:08am

Wait, why are these names forbidden?

cce · April 20, 2021, 1:21pm

So, here’s the problem I’m solving. Let’s say I release HypertextLiteral v1.0. I’m now working on incremental previews for my next major release. These aren’t yet v2.0, yet they certainly are not v1.X either, since they are breaking changes. How do I package them in the Julia ecosystem? I’m suggesting I could use the HypertextLiteral2 package name, but with versions v0.1, etc. Then, once it is stable enough, I can release HypertextLiteral v2.0 after a merge of the fork.

The usage of postfix numbers in this proposed situation is different from JSON. I think JSON is three disconnected packages, not a major revision of the same package. So, I agree with the objection to JSON2 and JSON3; they should have had more specific names. However, I disagree that this should lead to a blanket ban on using 2 as a postfix to construct a new package name.

mauro3 · April 20, 2021, 1:37pm

Are you requesting that the package manager also allow releasing v2.0.0-alpha, …, v2.0.0-rc1, etc.? That seems too complicated to me. Can you not just live on master until the changes have stabilized, then tag v2.0.0?

cce · April 20, 2021, 2:04pm

No. Not at all. I’m expressly suggesting that one could use numeric postfixes, e.g. HypertextLiteral2 to accomplish provisional releases of a subsequent major release.

I don’t think so. It’s important to get community engagement, and use of packages is essential for that.

Perhaps this is exactly why we have so many packages stuck in 0.x releases – there is no mechanism to handle pre-release packages (for a major release v2) once you tag v1.0.

GunnarFarneback · April 20, 2021, 2:32pm

Well, this is unorthodox and might be difficult to sell pedagogically to your users, but you could actually go back to 0.x versions when preparing for 2.0. (And you almost certainly should do it on a branch.)

A more natural approach would be to tag (git tag, not register) pre-release versions with whatever naming scheme you like and ask users who want to test it to install with ]add Package#tag. These versions would live outside the registry.

Or you just ask them to install Package#master and upgrade periodically to enjoy the bleeding edge.
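As a rough sketch of that workflow through the Pkg API: the tag name `v2.0.0-alpha1` and the helper `try_prerelease` are made up for illustration, and only `Pkg.activate` and `Pkg.add` are real calls. The helper assumes the package is registered, so that Pkg can find its repository by name.

```julia
using Pkg

# Hypothetical names, for illustration only.
pkg = "HypertextLiteral"
prerelease_tag = "v2.0.0-alpha1"   # a plain git tag, not a registered version

# Wrapped in a function so nothing touches the network until you call it.
function try_prerelease(name::AbstractString, rev::AbstractString)
    Pkg.activate(temp=true)        # throwaway environment; the default one is left alone
    Pkg.add(name=name, rev=rev)    # same effect as `pkg> add name#rev`
end

# try_prerelease(pkg, prerelease_tag)   # a specific pre-release tag
# try_prerelease(pkg, "master")         # or the bleeding edge
```

Because the package is tracked by git revision rather than by registered version, the registry never sees these pre-releases, and the v1.x series stays the only thing a plain `add` resolves to.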

tbeason · April 20, 2021, 2:56pm

I also don’t understand why you cannot just tag 1.0 and then have breaking changes on master or another branch as you move toward 2.0. Pkg can handle branches or other tags naturally, as others have already said, so there is no need to move all of that to a separate repo – that would create lots of additional headache and confusion.

kristoffer.carlsson · April 20, 2021, 3:08pm

No.

The commitment is to tag a breaking release when the API changes. The rate at which such tags will come has to be communicated explicitly by the project and cannot be encoded in the version number.

See 6. Compatibility · Pkg.jl

Until the authors of them decide to make breaking changes. Also not something that can be encoded in a version number.

This has nothing to do with being pre-1.0 but is just a consequence of a package tagging breaking releases.

In summary, there seem to be a lot of questions here but none of them seem really related to tagging a 1.0. What you basically say is that you want packages to make fewer breaking changes. That is a very valid position. But it is the way you want packages to be developed that you want changed, not whether a package goes from 0.18 to 0.19 or 18.0 to 19.0.

halleysfifthinc · April 20, 2021, 3:14pm

Here, you highlight the common notion that there is something “special” about version 1.0. I’m not convinced that with the Julia Pkg flavor of SemVer (referencing that Pkg treats minor version bumps as breaking pre-1.0) there is much basis for this special treatment.
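To make that Pkg rule concrete, here is a minimal sketch of the default (caret) compatibility range for fully specified versions. `caret_upper` and `is_compatible` are hypothetical helpers written for illustration, not Pkg functions, and the real resolver handles more cases (partial versions, tilde specifiers, explicit ranges).

```julia
# Exclusive upper bound of the default (caret) compat range for a full version.
# Rule of thumb: the left-most non-zero component is the "breaking" one.
function caret_upper(v::VersionNumber)
    v.major > 0 && return VersionNumber(v.major + 1)
    v.minor > 0 && return VersionNumber(0, v.minor + 1)
    return VersionNumber(0, 0, v.patch + 1)
end

# Would a dependent that declared compat with `declared` accept `candidate`?
is_compatible(declared::VersionNumber, candidate::VersionNumber) =
    declared <= candidate < caret_upper(declared)

is_compatible(v"0.21.8", v"0.21.9")   # true: a patch bump before 1.0 is non-breaking
is_compatible(v"0.21.8", v"0.22.7")   # false: a minor bump before 1.0 is breaking
is_compatible(v"1.0.0", v"1.5.0")     # true: a minor bump after 1.0 is non-breaking
is_compatible(v"18.0.0", v"19.0.0")   # false: a major bump is breaking
```

Under this rule, going from 0.21 to 0.22 carries the same compatibility meaning as going from 1.0 to 2.0; only the position of the breaking digit differs.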

I may be in the minority, but I think versioning should be primarily about API compatibility and breaking changes. Assumptions about stability and frequency of breaking changes are somewhat external to the version numbers. A quick look at the package dependents and the project history is going to be a more reliable method to gauge whether a package is stable, etc. Using a number to signal maintenance willingness, frequency of breaking changes, etc seems to me like a misuse of versioning.

Using DataFrames as an example, JuliaHub shows 400 dependents, 1k stars on GitHub, active maintenance, and 99% test coverage. DataFrames is a very widely used and depended-upon package; releasing v1.0 provides little tangible benefit.

jw3126 · April 20, 2021, 3:52pm

I think that is a reasonable approach.

cce · April 20, 2021, 4:10pm

I’m not suggesting this must be done in a separate repository. I’m suggesting that package registration of a second version of a package could use PkgName2.

My question is, how do I package development releases of a second version of my package, after I tag v1.0 and before I tag v2.0.

This is a technical view. There is also a social perspective. There is something special about v1.0, or else we wouldn’t have so many packages stuck in pre-1.0 versions.

I think there’s a significant social signaling challenge: one where the package is broadly adopted, but where the developers are planning lots of breaking changes still. In this case, if you tag v1.0 you end up in a dead zone before you can tag v2.0 since there’s no good way to distribute pre-releases of the second revision of your package.

Oh, I think there’s a significant benefit here. It signals a higher bar for breaking the interface, and increases the ability of dependent packages to be used together. Tagging v1.0 can be, and often is, used as a social marker of how the package maintainer intends to work with others.

In this methodology, if breaking changes to the DataFrames API were needed, they might decide to release DataFrames2 to validate those changes, rather than jump to DataFrames v2.0 (which may be necessarily less stable than v1.0). Then, over another year of refinement, people could still use DataFrames without conflicting with packages that want to use DataFrames2.

ericphanson · April 20, 2021, 4:12pm

I don’t think they are strictly forbidden. They won’t be automerge’d because the name is too similar to an existing package name, but they can still be manually merged (note that the distance-similarity check is mostly to avoid typosquatting). I think @Skoffer is referring to a particular situation involving forking a package due to perceived non-responsiveness of the maintainer, which is kind of a different situation (the registry maintainers were suggesting waiting a bit more for a response).

Skoffer · April 20, 2021, 4:38pm

Well, I should correct myself: it is forbidden to make packages with names similar to packages of other authors. So, no DataFrames2, CSV2, etc., which is probably reasonable. Now, when you are the author of both packages, the situation is in a gray zone, I suppose. There are no explicit rules that either allow or forbid registering such a package. But what is certain is that you should go through a process of explaining why you want to do it (no auto-merging), and it is probable that even if such packages are allowed at some point, restrictions can be added later. So, one should use this approach at one’s own risk.
