How to gracefully retire a package

Julia 1.11 has made PushVectors.jl obsolete. I wonder what the best workflow is to retire it.

I have read the previous threads on this, with various recommendations (eg 1, 2). I am still not sure how to proceed.

Specifically,

  1. I don’t want to make the package uninstallable on >=1.11, as it is a very harsh move (code is functional, and will work forever, just not needed after 1.11). So either releasing a new version that is capped, or capping it directly in the registry, is not a good solution, as it forces immediate work on others, instead of allowing them to pick a time for it.

  2. I considered a gentle warning message, eg as

    function __init__()
        if VERSION ≥ v"1.11"
            @warn """
                  The issue which required PushVectors.jl as a workaround has been fixed in
                  Julia 1.11, so consider removing it. See README.md.
                  """
        end
    end
    

    but having that show up constantly is a bit obnoxious.

  3. I will archive the repo, but AFAIK that currently does not do anything in Juliahub.

  4. Since this package has 2 dependents, I can just notify them directly. But that does not scale.

I think that in the early days of Julia, retiring packages was not an immediate problem, so it is understandable that this was not a priority. But now that the ecosystem is mature this is an issue we have to deal with.

Also, note that I am not talking about “hard deprecation”, ie making the package impossible or difficult to install. One of the great things about Julia is reproducibility: once code works, you can dig it out 10 years later and it will still work. Once in the registry, packages should remain installable forever, and ideally work on all future Julia versions to make upgrading easier.

10 Likes

In this case I would just make the maintenance status and the migration path clear in the README, and possibly update docstrings with relevant information.

4 Likes

A better way to do this IMO would be to emit the message during precompilation, that way you usually only get the message when you actually install the package. That’s what we went for when deprecating Formatting.jl, see this example: Formatting.jl/src/Formatting.jl at master · JuliaIO/Formatting.jl · GitHub

So in your case, I’d do

 if ccall(:jl_generating_output, Cint, ()) == 1
        @warn """
              The issue which required PushVectors.jl as a workaround has been fixed in
              Julia 1.11, so consider removing it. See README.md.
              """
    end
end

instead of using __init__.

9 Likes

Semi-related, but an edit of an old benchmark I had for setindex! vs push!-after-sizehint! corroborates the narrowing in performance advantage (push! used to be ~80μs), though setindex! is still far faster:

julia> benchmarkalloc3() # v1.11.0
push! after sizehint!:
  1.096 μs (3 allocations: 128.06 KiB)
  13.500 μs (0 allocations: 0 bytes)
setindex! an undef Array
  1.040 μs (3 allocations: 128.06 KiB)
  1.680 μs (0 allocations: 0 bytes)
push! PushVector with sizehint:
  1.129 μs (4 allocations: 128.09 KiB)
  11.300 μs (0 allocations: 0 bytes)
Benchmark code
using Benchmarktools, PushVectors

function pushtest(v, m) # after sizehint!
         for i in 1:m  push!(v, i)  end
         empty!(v)  # prevent memory leak, should keep capacity
end

function settest(v, m)
         for i in 1:m  v[i] = i     end
         v
end

function benchmarkalloc3()
         m = 2^14

         println("push! after sizehint!:")
         v = @btime sizehint!(Int64[], $m)
         @btime pushtest($v, $m)

         println("setindex! an undef Array")
         v2 = @btime Vector{Int64}(undef, $m)
         @btime settest($v2, $m)

         println("push! PushVector with sizehint:")
         v3 = @btime PushVector{Int64}($m)
         @btime pushtest($v3, $m)
         nothing
end

It’s buried in a long Rust blog thread, but the reason for that benchmark was that Rust’s analog of push! checks the allocated capacity and thus possibly skips the vector-lengthening code to the analog of a setindex! and incrementing the semantic length, so the performance difference would be far less if an analog of undef vectors were used in Rust. At the time, the working hypothesis was the ccall overhead, but it was unclear because of the C implementation of _growend!. The new _growend! and push! seems to do something like the Rust approach, but I can’t tell why the improved push! still has a ways to approach setindex!, maybe the extra number crunching to check the capacity? Maybe PushVectors can pull this off and it won’t be obsolete for a bit longer.

2 Likes

I think the biggest bottleneck with push is that the compiler doesn’t figure out how to vectorize it

I thought the same, but the difference remains if you fill the vector with rand(Int):

40.544 μs   # pushtest
19.999 μs   # settest

Below is the code. According to @code_native, settest is not vectorized in this case, contrary to @Benny’s original example.

function pushtest(v, m) # after sizehint!
    for i in 1:m
        push!(v, rand(Int))
    end
    empty!(v)  # prevent memory leak, should keep capacity
end

function settest(v, m)
    for i in 1:m
        v[i] = rand(Int)
    end
    v
end
1 Like

It seems that the difference simply comes from updating the length of the vector. With the new data structure PV that includes the length I get

36.622 μs   # pushtest
35.919 μs   # settest for new type PV
Code

I’m only showing the new code. I’m using Chairmarks.jl.

mutable struct PV{T}
    const v::Vector{T}
    n::Int
end

function settest(w::PV{T}, m) where T
    for i in 1:m
        w.v[i] = rand(T)
        w.n += 1
    end
    w
end

function benchmarkalloc3()
    m = 2^14

    println("push! after sizehint!:")
    v = sizehint!(Int64[], m)
    display(@b pushtest($v, $m))

    println("setindex! an undef PV:")
    v2 = Vector{Int64}(undef, m)
    w = PV(v2, 0)
    display(@b settest($w, $m))
end
1 Like

I think Oscar doesn’t mean that the arithmetic doesn’t vectorize, it’s that the compiler doesn’t realize it can vectorize push! calls themselves (presumably due to the branches inside that determine if the storage should be replaced or not).

1 Like

Honestly, the package was kind of a joke, I expected that the issue would be fixed much quicker. I am actually eager to retire it. Performance fixes should go to Base.

3 Likes

If I’m reading this right, majority of it is whether the particular calls around setindex! are vectorizable, and the rest is the increment of the length (likely on the heap) taking a surprisingly (to me) significant portion of the time. IIRC Rust’s Vec has its metadata on the stack and the buffer on the heap, but it sounds like the same sort of work needs to happen for mutation in a method via a mutable reference. Rust wasn’t as easily tinkered with, so the steps weren’t profiled at the time, and the simple code examples were probably also vulnerable to compiler optimizations like SIMD throwing off comparisons.

I think this is the right answer for now, and adding warnings on loading or pre compilation is a nice bonus, I think. If the question is what functionality would we like Pkg / registries to support, I have some thoughts.

There are a few different reasons a package might be deprecated or a package author might otherwise want to discourage people from using them:

  1. Functionality isn’t needed anymore due to improvements to Julia (as in this example).
  2. Functionality is better provided by other packages in the ecosystem (just had this come up with BioTools.jl).
  3. The author just doesn’t plan to support it going forward.

In all of these cases, I totally agree that working code should be allowed to continue to work, so capping Julia versions is a bad idea. But it would be nice to be able to add a Pkg nudge if someone tries to add or load a deprecated package as a direct dependency, either by asking for confirmation, or requiring an extra flag (eg ] add PushVectors errors, ] add PushVectors --force works).

I wouldn’t want to add friction if it’s coming in as an indirect dependency - unless there’s a security issue, I never want to be forced to get involved in decisions package authors are making.

1 Like

Separately, a bunch of posts here should probably be split into a different discussion, but I didn’t know if I should use a flag - I associate that with bad behavior and I didn’t want to imply that…

Yes, I think that this is the right solution. Perhaps an extra field could be added to the Project.toml, indicating deprecation, possibly with a message that Pkg.add could display.

As for existing dependents, I think that they are best notified via opening an issue.