Best way to have GC manage freeing C-allocated storage

jlapeyre · August 24, 2022, 2:45am

What is the best (or a good) way of solving the following very generic problem? I’d like to:

- Call obj_alloc() and obj_free() in some C library from Julia. On the Julia side, work with Ptr{obj}. It can be considered opaque; it is only passed to other functions in the same library.
- Have Julia GC handle calling obj_free for me, probably with a wrapper.
- Have the simplest interface to the wrapped object possible. The user shouldn’t have to know where it comes from.
- When using the obj, have as small a performance degradation as possible compared to using Ptr{obj} directly.

So, in particular, things like wrapping the alloc in a try block and freeing in finally are not what I’m looking for.

For example, here is an attempt for random number generators in GSL (gnu scientific library)

using GSL: GSL

struct RNG
    x::Ptr{GSL.gsl_rng}
    _y::Ref
    function RNG(rng_type=GSL.gsl_rng_taus2)
        _rng = GSL.rng_alloc(rng_type)
        mrng = Ref(_rng)
        rng = new(_rng, mrng)
        finalizer(x -> GSL.rng_free(rng.x), mrng)
        return rng
    end
end

(rng::RNG)() = rng.x

Comments

This apparently uses more memory than just using Ptr.
In some tests generating uniform random samples, I see no performance degradation.
If I use a mutable struct as a wrapper, instead of struct, there is a loss of performance.
I use Ref above for convenience, but a mutable struct wrapper also works for the second field.

It seems like there must be a simpler way, but I don’t see it.
Here is a more generic version

struct AllocFree{T, AF, FF}
    x::Ptr{T}
    _y::Ref
    function AllocFree{T,AF,FF}(args...) where {T, AF, FF}
        _obj = AF.instance(args...)
        mobj = Ref(_obj)
        obj = new{T,AF,FF}(_obj, mobj)
        finalizer(x -> FF.instance(obj.x), mobj)
        return obj
    end
end

(obj::AllocFree)() = obj.x

Then

julia> using GSL;

julia> mkrng(t=GSL.gsl_rng_taus2) = AllocFree{GSL.gsl_rng, typeof(GSL.rng_alloc), typeof(GSL.rng_free)}(t);

julia> rng = mkrng()
AllocFree{gsl_rng, typeof(rng_alloc), typeof(rng_free)}(Ptr{gsl_rng} @0x0000000002677af0, Base.RefValue{Ptr{gsl_rng}}(Ptr{gsl_rng} @0x0000000002677af0))

julia> GSL.ran_flat(rng(), 0.0, 1.0)
0.18691460322588682

mkitti · August 24, 2022, 3:47am

Finalizers can be problematic if it is an issue for the free to be called from a different thread.

If possible, I would encourage the use of the do syntax that implements the try-finally method you are avoiding. This hides the free logic away from the user while also making the free occur at a deterministic time.

function use_obj(f::Function, args...)
   obj = obj_alloc(args...)
   try
       return f(obj)
   finally
       obj_free(obj)
   end
end

use_obj(args...) do obj
   # use object here.
end
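For anyone who wants to try the do-block pattern without a C library at hand, here is a minimal runnable sketch. It assumes Libc.malloc and Libc.free as stand-ins for obj_alloc and obj_free; the FREED counter is hypothetical and exists only to make the free observable.

```julia
# Stand-ins for the C library's obj_alloc/obj_free (assumption: plain malloc/free).
const FREED = Ref(0)   # hypothetical counter, only here to observe the free

obj_alloc(nbytes::Integer) = Libc.malloc(nbytes)

function obj_free(p::Ptr{Cvoid})
    Libc.free(p)
    FREED[] += 1
    return nothing
end

# Same shape as use_obj above: allocate, run the user's code, always free.
function use_obj(f::Function, args...)
    obj = obj_alloc(args...)
    try
        return f(obj)
    finally
        obj_free(obj)
    end
end

# The pointer is only valid inside the block.
answer = use_obj(8) do p
    q = convert(Ptr{Int64}, p)
    unsafe_store!(q, 42)
    unsafe_load(q)
end
```

The free runs when the block exits, whether it returns normally or throws, so the pointer must not be stored anywhere that outlives the block.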

Now to address your example, note that Ref is an abstract type.

julia> isabstracttype(Ref)
true

julia> r = Ref(5)
Base.RefValue{Int64}(5)

julia> typeof(r)
Base.RefValue{Int64}

julia> isabstracttype(typeof(r))
false

julia> isconcretetype(typeof(r))
true

I recommend making the field concrete.

struct RNG
    x::Ptr{GSL.gsl_rng}
    _y::Base.RefValue{Ptr{GSL.gsl_rng}}
    ...
end

struct AllocFree{T, AF, FF}
    x::Ptr{T}
    _y::Base.RefValue{Ptr{T}}
...
end

ikirill · August 24, 2022, 4:00am

What is the purpose of storing the same pointer in two different fields, versus having one field x::Ref{Ptr{T}} or having a mutable struct with a field x::Ptr{T}?

Can you give an example of a benchmark that experiences this loss? Because it’s possible it could be fixed by making the struct a concrete type.

When you’re accessing objects like this in Julia, I think it is a little more idiomatic to follow the convention of Ref and override obj[] instead of obj(). I.e., you’re getting the “only” index present in obj, rather than calling it as a function.

Note: I’ve only ever used mutable struct Obj{T}; ptr :: Ptr{T}; end when wrapping my own stuff, and just never ran into any problems I could notice, so I’m curious about this.
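A sketch of that single-field mutable wrapper, under stated assumptions: Libc.malloc and Libc.free stand in for the library's alloc and free, and the names Handle and N_FREED are hypothetical. It overloads obj[] as suggested, and also defines Base.unsafe_convert so the wrapper itself can be passed to ccall, which keeps it alive for the duration of the call.

```julia
const N_FREED = Ref(0)   # hypothetical counter, only here to observe the free

mutable struct Handle
    ptr::Ptr{Cvoid}
    function Handle(nbytes::Integer)
        ptr = Libc.malloc(nbytes)          # stand-in for obj_alloc
        ptr == C_NULL && throw(OutOfMemoryError())
        h = new(ptr)
        finalizer(h) do obj
            Libc.free(obj.ptr)             # stand-in for obj_free
            obj.ptr = C_NULL               # guard against use after free
            N_FREED[] += 1
        end
        return h
    end
end

# obj[] returns the raw pointer, following the Ref convention.
Base.getindex(h::Handle) = h.ptr

# Lets a Handle be passed where ccall expects a Ptr{Cvoid}.
Base.unsafe_convert(::Type{Ptr{Cvoid}}, h::Handle) = h.ptr

h = Handle(16)
# memset from the C runtime serves as "some function in the same library".
ccall(:memset, Ptr{Cvoid}, (Ptr{Cvoid}, Cint, Csize_t), h, 0x7f, 16)
first_byte = unsafe_load(convert(Ptr{UInt8}, h[]))
```

Passing h rather than h[] to ccall is the safer habit: a raw pointer extracted with h[] does not by itself keep the wrapper reachable.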

mkitti · August 24, 2022, 4:53am

He is potentially avoiding one level of indirection with the first field x. The memory layout of the struct is just two pointers. A mutable struct wrapping a pointer is really a pointer to a pointer.

The second field mainly is needed to create a mutable object that Julia will try to garbage collect. Technically we could simplify by just making _y a reference to nothing while capturing the pointer within the anonymous function.

struct RNG
    x::Ptr{GSL.gsl_rng}
    _y::Base.RefValue{Nothing}
    function RNG(rng_type=GSL.gsl_rng_taus2)
        _rng = GSL.rng_alloc(rng_type)
        rng = new(_rng, Ref(nothing))
        finalizer(_ -> GSL.rng_free(_rng), rng._y)
        return rng
    end
end
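The same idea in a form that runs without GSL, as a rough sketch: Libc.malloc and Libc.free replace rng_alloc and rng_free, and the names Owned, counted_free and FREE_COUNT are hypothetical. Since the struct hands out a raw pointer, the sketch wraps pointer use in GC.@preserve so the wrapper (and with it the token carrying the finalizer) stays alive while the pointer is in use.

```julia
const FREE_COUNT = Ref(0)   # hypothetical counter, only here to observe the free

function counted_free(p::Ptr{Cvoid})
    Libc.free(p)             # stand-in for the library's free function
    FREE_COUNT[] += 1
    return nothing
end

struct Owned
    x::Ptr{Cvoid}
    _y::Base.RefValue{Nothing}   # mutable token whose only job is to carry the finalizer
    function Owned(nbytes::Integer)
        p = Libc.malloc(nbytes)  # stand-in for the library's alloc function
        token = Ref(nothing)
        # The closure captures the pointer itself, not the wrapper.
        finalizer(_ -> counted_free(p), token)
        return new(p, token)
    end
end

Base.getindex(o::Owned) = o.x

function fill_and_read(o::Owned)
    # Keep `o` alive while the raw pointer is being used.
    GC.@preserve o begin
        q = convert(Ptr{UInt8}, o[])
        unsafe_store!(q, 0x2a)
        unsafe_load(q)
    end
end

o = Owned(4)
value = fill_and_read(o)
```

Calling finalize(o._y) triggers the free immediately, which is handy for testing; in normal use the GC decides when it runs.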

jlapeyre · August 24, 2022, 5:48pm

I didn’t realize that this is a problem, or even think about it. But, I’m not surprised. I’ll have to learn more about it. I mean, is it always a problem? If sometimes, then when?

Regarding Ref. I know it’s an abstract type; leaving the wrapped type out was intentional, as it seemed to make no difference in usage or performance (why would it), so it only adds a useless detail. But, I may have missed something.

EDIT: Something I missed, but you picked up:

x -> FF.instance(obj.x)

Using x for two different things here is confusing.

jlapeyre · August 24, 2022, 5:53pm

Exactly, that might even be better. For example, this might make it a bit more clear why the field _y exists.

mkitti · August 24, 2022, 6:04pm

It depends on the C library that you got the pointer from. For example, the HDF5 C library is not thread safe by default. If I attempt to close an HDF5 object from another thread while performing another operation concurrently, it will crash stochastically.
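One possible mitigation, sketched under assumptions: guard every call into the library with a single lock, and have the finalizer re-register itself when the lock is busy instead of blocking. This resembles the delayed-finalizer strategy described in the Julia manual's notes on safe use of finalizers. LIB_LOCK, Guarded, release and make_guarded are hypothetical names, Libc.malloc and Libc.free stand in for the library, and the scheme only helps if all other library calls take the same lock.

```julia
const LIB_LOCK = ReentrantLock()   # hypothetical: must guard every call into the library
const N_RELEASED = Ref(0)          # hypothetical counter, only here to observe the free

mutable struct Guarded
    ptr::Ptr{Cvoid}
end

function release(g::Guarded)
    g.ptr == C_NULL && return nothing          # already released
    if islocked(LIB_LOCK) || !trylock(LIB_LOCK)
        # Library is busy: try again on a later GC pass instead of blocking.
        finalizer(release, g)
        return nothing
    end
    try
        Libc.free(g.ptr)                       # stand-in for the library's free
        g.ptr = C_NULL
        N_RELEASED[] += 1
    finally
        unlock(LIB_LOCK)
    end
    return nothing
end

# finalizer returns its second argument, so this yields the wrapped object.
make_guarded(nbytes::Integer) = finalizer(release, Guarded(Libc.malloc(nbytes)))
```

This serializes the free with other library calls; it does not control which thread runs the finalizer, so it is not enough for libraries that require the free to happen on a specific thread.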

Using finalizers for resource management is really not a great idea. This is a very old issue that has been the subject of much debate.

github.com/JuliaLang/julia

`with` for deterministic destruction

opened 01:24AM - 25 Jul 14 UTC

klaufir

speculative design

**Deterministic destruction**

Deterministic destruction is a guarantee that some specified resources will be released when exiting from the enclosing block even if we exit from the block by the means of exceptions.

**Example**

In python deterministic destruction is done using the `with` statement.

``` python
with open("myfile","w") as f:
    f.write("Hello")
```

Considering the code above we don't have to worry about closing a file explicitly. After the `with` block, the file will be closed. Even when something throws an exception inside the `with` block, the resources handled by the `with` statement will be released (in this case closed).

**Other languages**

- C++ always had guaranteed destructor calls at scope exit for stack objects - even in the face of exceptions (see [RAII](http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization)).
- In C# there is the [`using` statement](http://msdn.microsoft.com/en-us/library/yh598w02.aspx) for this same purpose.
- Java has [try-with-resources](http://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html) since Java7.

**Julia?**

It is my firm belief that Julia also needs to have a feature supporting deterministic destruction. We have countless cases when a resource needs to be closed at soon as possible: serial ports, database connections, some external api handles, files, etc. In cases like these deterministic destruction would mean cleaner code, no explicit `close()` / `release()` calls and no sandwiching the code in `try` .. `catch` .. `finally` blocks.

github.com/JuliaLang/julia

stop using finalizers for resource management?

opened 02:39AM - 09 May 15 UTC

JeffBezanson

speculative design

Finalizers are inefficient and unpredictable. And with the new GC, it might take… much longer to get around to freeing an object, therefore tying up its resources longer. Ideally releasing external resources should not be tied to how memory management works. We are already not far from this with the `open(f) do` construct. I think that and/or `with` should be used. Perhaps there could be some other mechanism for registering files to close eventually. Discussed this with @carnaval .

mkitti · August 24, 2022, 6:42pm

A recent development that has been incorporated in the Julia code base is eager finalizer insertion. Essentially, if the compiler can prove that the object does not outlive the enclosing scope, it tries to call the finalizer right there instead of leaving it to the GC.

github.com/JuliaLang/julia

Eager finalizer insertion

JuliaLang:master ← JuliaLang:kf/eagerfinalizers

opened 07:13AM - 11 May 22 UTC

Keno

+382 -79

This is a variant of the eager-finalization idea (e.g. as seen in #44056), but with a focus on the mechanism of finalizer insertion, since I need a similar pass downstream. Integration of EscapeAnalysis is left to #44056.

My motivation for this change is somewhat different. In particular, I want to be able to insert finalize call such that I can subsequently SROA the mutable object. This requires a couple design points that are more stringent than the pass from #44056, so I decided to prototype them as an independent PR. The primary things I need here that are not seen in #44056 are:

- The ability to forgo finalizer registration with the runtime entirely (requires additional legality analyis)
- The ability to inline the registered finalizer at the deallocation point (to enable subsequent SROA)

To this end, adding a finalizer is promoted to a builtin that is recognized by inference and inlining (such that inference can produce an inferred version of the finalizer for inlining).

The current status is that this fixes the minimal example I wanted to have work, but does not yet extend to the motivating case I had. Nevertheless, I felt that this was a good checkpoint to synchronize with other efforts along these lines.

Currently working demo:

```julia
julia> const total_deallocations = Ref{Int}(0)
Base.RefValue{Int64}(0)

julia> mutable struct DoAlloc
           function DoAlloc()
               this = new()
               Core.finalizer(this) do this
                   global total_deallocations[] += 1
               end
               return this
           end
       end

julia> function foo()
           for i = 1:1000
               DoAlloc()
           end
       end
foo (generic function with 1 method)

julia> @code_llvm foo()
;  @ REPL[3]:1 within `foo`
define void @julia_foo_428() #0 {
top:
  %.promoted = load i64, i64* inttoptr (i64 4373384000 to i64*), align 64
;  @ REPL[3]:2 within `foo`
  %0 = add i64 %.promoted, 1000
;  @ REPL[3]:3 within `foo`
; ┌ @ REPL[2]:4 within `DoAlloc`
; │┌ @ REPL[2]:5 within `#3`
; ││┌ @ Base.jl within `setproperty!`
     store i64 %0, i64* inttoptr (i64 4373384000 to i64*), align 64
; └└└
;  @ REPL[3]:4 within `foo`
  ret void
}

julia> foo()

julia> total_deallocations[]
1000
```

Thoughts @jpsamaroo @aviatesk ?

Yes. I’ll also point out that the field of a RefValue is also x, so tracking x here is particularly fraught.

julia> r = Ref(5)
Base.RefValue{Int64}(5)

julia> r.x
5

julia> r[]
5

The performance hit is in the finalizer, which is hard to measure. Essentially, the compiler has no idea what is contained within the Ref, so you will end up getting dynamic dispatch on your free method from the finalizer.

struct Foo
    y::Ref
end
g(foo::Foo) = foo.y[]

julia> @code_warntype g(foo)
MethodInstance for g(::Foo)
  from g(foo::Foo) in Main at REPL[33]:1
Arguments
  #self#::Core.Const(g)
  foo::Foo
Body::Any
1 ─ %1 = Base.getproperty(foo, :y)::Ref
│   %2 = Base.getindex(%1)::Any
└──      return %2
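To compare the two field declarations side by side without reading @code_warntype output, one can ask inference for the return type directly. This is only a sketch; LooseFoo, TightFoo, get_loose and get_tight are hypothetical names.

```julia
struct LooseFoo
    y::Ref                      # abstract field type, as in the original wrapper
end

struct TightFoo
    y::Base.RefValue{Int}       # concrete field type, as recommended above
end

get_loose(foo::LooseFoo) = foo.y[]
get_tight(foo::TightFoo) = foo.y[]

# Base.return_types returns one inferred return type per matching method.
loose_rt = first(Base.return_types(get_loose, (LooseFoo,)))
tight_rt = first(Base.return_types(get_tight, (TightFoo,)))

println("abstract field infers as: ", loose_rt)
println("concrete field infers as: ", tight_rt)
```

With the concrete field the result is inferred as Int, so code that touches the field can be compiled without dynamic dispatch.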

jlapeyre · August 25, 2022, 3:35am

In any case, I think it’s safer (in terms of performance degradation), and easy, to make y inferrable.

I think the question of a canonical way to do this is still interesting. But, it’s clear it shouldn’t be recommended for general use at this point.

Also, a remark in a different comment above:

I think it is a little more idiomatic to follow the convention of Ref and override obj[] instead of obj()

I thought about that. I thought it might be confusing, that someone would think this is essentially dereferencing a Ref. But, now, I think I agree with you.
