Feature request: unsafe_free!

try/finally blocks are not the answer for performance critical code

Yeah, I did actually think about using try/finally here but it was a bit too costly to build into @no_escape, though it is actually cheaper than one might think, e.g.

julia> function foo1(buf=default_buffer())
           @no_escape buf begin
               z = @alloc(Int, 10)
               z .= 1
               sum(z)
           end
       end;

julia> function foo2(buf=default_buffer())
           b = default_buffer()
           cp = Bumper.checkpoint_save(b)
           try
               foo1(buf)
           finally
               Bumper.checkpoint_restore!(cp)
           end
       end;

and then

julia> @btime foo1();
  12.717 ns (0 allocations: 0 bytes)

julia> @btime foo2();
  34.566 ns (0 allocations: 0 bytes)

julia> @btime foo1($(default_buffer()));
  5.610 ns (0 allocations: 0 bytes)

julia> @btime foo2($(default_buffer()));
  26.099 ns (0 allocations: 0 bytes)

so if you never actually hit the finally branch it’s not that expensive to have around so long as you’re not using it in the tightest of tight loops.

I could also make a varitant of @no_escape that does the finally branch automatically, e.g. it could be called @no_escape_finally or something, would that help?

Good idea.

3 Likes

Not for me specifically – I don’t have any problems with what Bumper.jl is doing, and the memory leak through exceptions was clear enough for me even without explicit documentation (but that does require some reading between the lines). I’d wait until one of your users requests that feature :wink:

But generally, Bumper.jl is no replacement for the unsafe_free! feature.

To give another example that is clearly incompatible with arena allocation. Collecting iterators of unknown eltype uses a very nice construction in base/array.jl:

function collect_to!(dest::AbstractArray{T}, itr, offs, st) where T
    # collect to dest array, checking the type of each result. if a result does not
    # match, widen the result type and re-dispatch.
    i = offs
    while true
        y = iterate(itr, st)
        y === nothing && break
        el, st = y
        if el isa T
            @inbounds dest[i] = el
            i += 1
        else
            new = setindex_widen_up_to(dest, el, i)
            return collect_to!(new, itr, i+1, st)
        end
    end
    return dest
end

We collect into dest until we hit an incompatible type; then we widen the result-type and re-dispatch. This clearly ends the lifetime of dest. We could introduce an opportunistic unsafe_free! into that case.

We could even now dest isa Vector && resize!(dest, 0), but that is imo inferior to real support.

2 Likes

try/catch/finally aren’t the only requirements for safe freeing; if user code you’re calling with the buffer captures it in a task and keeps it alive longer than the caller, you also hit UAF. Not having unsafe_free! eliminates that failure case completely, since GC will free it when the final task with a reference to it drops.

I know that this is not your motivating usecase, but it’s something that can very easily happen by accident.

That depends on the lifetime of dest in the caller; just inserting unsafe_free! may lead to UAF if the caller still needs dest.

But that is an internal helper function of collect. We control all the callers and they don’t need dest anymore. Indeed, what should they do with a half-initialized dest that has the wrong eltype for the job they wanted done?

Would you have had the same objection in 1.9 to a PR that inserts dest isa Vector && (resize!(dest, 0); sizehint!(dest, 0))? That also freed the backing memory.

UAF is easy to hit. Just run finalize on a BigInt. (finalize doesn’t even warn about that in the docs! It should be called unsafe_finalize!)

Yes, that’s why I said that it depends on the lifetime of dest in the caller - if the caller doesn’t need it anymore, only then are we able to free it early. My point is that purely from looking at that function, you can’t know whether that’s the case. Lifetime decisions are global, not local.

You control all the callers in Base, yes. But there’s nothing stopping someone from finding collect_to! and using it to write their own function.

Yes, grabbing Julia internals willy-nilly is discouraged, but it also happens all the time. The social contract as I understand it is that if subsequent releases break your code, that’s your responsibility. But it wouldn’t be great to weaken that to “if you use internals, you might get a segfault or use-after-free”, as it stands that’s generally only true for functions clearly marked unsafe, or written in C and called through @ccall.

So sure, this could be changed to read unsafe_collect_to! and so on. I’m trying to draw attention here to the fact that taking over memory management in a garbage-collected language very easily undermines assumptions which are otherwise safe to make about code, and should be approached with great care. I would argue that calling unsafe_free! on a passed-in parameter should be off-limits under any plausible circumstance. Maybe the body of the function could be moved into collect, and then it might be reasonable.

And this is the concern with adding unsafe_free! as a primitive ubiquitously available to all Julia devs. Decades of experience with C programmers have shown that you simply can’t trust them to use manual memory management correctly. Experience is of limited help, only a comprehensive test suite bringing in the usual tools can give confidence that nasal demons don’t lurk in the depths. Adding unsafe_free! would bring that charming quality to Julia.

1 Like