Avoid GC freeing Ptr{Nothing}

@lobingera, let me try to explain this here, as I’ve spent some time and already have asked @yuyichao too many questions while trying to understand the same thing :wink:

If for some reason you need to operate on a raw Ptr to val inside a function, you absolutely need to ensure that val is not eaten by the GC while the pointer is in use (otherwise you’ll get segfaults, etc.). Simplified code:

struct ContainsPtr
    ptr::Ptr{Cdouble}
end

you can’t do

function f(n=10)
    v = rand(n)
    a = ContainsPtr(pointer(v))
    ccall(...., (Ref{ContainsPtr}, ), a)
end

as v could be GCed anywhere after the creation of pointer(v) (remember: stuff may be rearranged or elided by the compiler at will, so don’t think in terms of “on this line v is still alive”… :stuck_out_tongue:). You should explicitly GC.@preserve v whenever its pointer might be in use:

function g(n=10)
    v = rand(n)
    GC.@preserve v begin
        a = ContainsPtr(pointer(v))
        ccall(...., (Ref{ContainsPtr}, ), a)
    end
end

thus GC is notified that v may NOT be released before the GC.@preserve block has ended. But as noticed by @yuyichao, v needs to have a value at the time you enter the block (I’m actually surprised that your block even parsed?!). It’s the value held by v that you’re protecting, not the variable.
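A minimal runnable sketch of the same pattern, using only Base. Here unsafe_load stands in for the elided ccall, and load_first is a hypothetical helper name, not something from the thread:

```julia
struct ContainsPtr
    ptr::Ptr{Cdouble}
end

# Hypothetical helper: reads the first element of v through the raw pointer.
function load_first(v::Vector{Cdouble})
    # v is kept rooted for the whole block, so a.ptr stays valid inside it.
    GC.@preserve v begin
        a = ContainsPtr(pointer(v))
        unsafe_load(a.ptr)   # stands in for the ccall that would consume `a`
    end
end

x = load_first([1.5, 2.5])
```

The value of the GC.@preserve block is the value of its body, so the helper simply returns the loaded number.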


As for the second issue:
When ccalling, there is a chain of things happening to a variable. Consider the example ccall(...., (Ref{ContainsPtr}, ), a), where a isa ContainsPtr.
While protecting a from GC, ccall pipes it through

a → ra = Base.cconvert(Ref{ContainsPtr}, a) → pa = Base.unsafe_convert(Ref{ContainsPtr}, ra)

When ccall(...., (Ptr{ContainsPtr}, ), a) is executed the same chain takes place, but with Ptr{ContainsPtr} instead of Ref{ContainsPtr}.

Rough explanation is as follows:

  • the first function in the chain should convert a to something that in Julia has the same memory layout as ContainsPtr (here it’s already in that form), and
  • the second ‘strips’ Julia’s header (2 bytes, if I remember correctly?) attached to every* Julia struct, by shifting the pointer appropriately (@yuyichao, please correct me here if I’m wrong).
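The chain can also be walked by hand, which makes the two steps visible. This is only a sketch with a plain Vector; the names v, owner, raw and second_element are illustrative, and GC.@preserve plays the role that ccall normally plays for the intermediate object:

```julia
v = [1.5, 2.5, 3.5]

# Step 1: cconvert returns an ordinary Julia object that owns the memory.
owner = Base.cconvert(Ptr{Cdouble}, v)

# ccall would keep `owner` rooted during the foreign call;
# GC.@preserve does the same job in this hand-written version.
second_element = GC.@preserve owner begin
    # Step 2: unsafe_convert extracts the raw pointer from `owner`.
    raw = Base.unsafe_convert(Ptr{Cdouble}, owner)
    unsafe_load(raw, 2)
end
```

The raw pointer is only meaningful while owner is preserved, which is why the second step happens inside the block.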

But the point is that you should provide a version of the first function specialized to your type. Grabbing the first example from Cairo.jl:

function stroke(ctx::CairoContext)
    save(ctx)
    # use uniform scale for stroking
    reset_transform(ctx)
    ccall((:cairo_stroke, libcairo), Nothing, (Ptr{Nothing},), ctx.ptr)
    restore(ctx)
end

Here we have ctx::CairoContext but want to pass Ptr{Nothing} to ccall. The solution is to implement

Base.cconvert(::Type{Ptr{Nothing}}, ctx::CairoContext) = ctx
Base.unsafe_convert(::Type{Ptr{Nothing}}, ctx::CairoContext) = ctx.ptr

and rewrite stroke as

function stroke(ctx::CairoContext)
    save(ctx)
    # use uniform scale for stroking
    reset_transform(ctx)
    ccall((:cairo_stroke, libcairo), Nothing, (Ptr{Nothing},), ctx)
    restore(ctx)
end

Now you don’t have to worry and analyze the function to see if it is safe to access ctx.ptr etc. (and that ctx isn’t GCed in the meantime), as every operation on pointers is encapsulated within ccall.

this example might not be of value but consider this (contrived) example:

mutable struct FirstOrLast
   first::Vector{Float64}
   last::Vector{Float64}
   ptr::Ptr{Cdouble}
   function FirstOrLast(a,b,pointtofirst=true)
      fl = new(a,b)
      fl.ptr = pointtofirst ? pointer(fl.first) : pointer(fl.last)
      return fl
   end
end

Here we keep in ptr a pointer to one of the arrays.
Let fl = FirstOrLast(rand(5), rand(10)). By the time you call ccall(..., (Ptr{Cdouble}, ), fl.ptr), fl might have already been GCed and neither fl.first nor fl.last is there anymore, so the pointer passed to ccall is invalid and you get a beautiful segfault; however, if you define

Base.cconvert(::Type{Ptr{Cdouble}}, fl::FirstOrLast) = fl
Base.unsafe_convert(::Type{Ptr{Cdouble}}, fl::FirstOrLast) = fl.ptr

you’re safe.
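A runnable sketch of this example, hedged accordingly: load_pointed is a hypothetical helper, and libc’s memmove stands in for whatever C function would really consume the pointer. In this version cconvert returns the wrapper itself (my understanding is that ccall roots whatever cconvert returns), and unsafe_convert extracts the raw pointer only at the last moment:

```julia
mutable struct FirstOrLast
    first::Vector{Float64}
    last::Vector{Float64}
    ptr::Ptr{Cdouble}
    function FirstOrLast(a, b, pointtofirst = true)
        fl = new(a, b)
        fl.ptr = pointtofirst ? pointer(fl.first) : pointer(fl.last)
        return fl
    end
end

# Step 1 of the chain: hand ccall the wrapper, so the wrapper is what stays rooted.
Base.cconvert(::Type{Ptr{Cdouble}}, fl::FirstOrLast) = fl
# Step 2 of the chain: extract the raw pointer from the rooted wrapper.
Base.unsafe_convert(::Type{Ptr{Cdouble}}, fl::FirstOrLast) = fl.ptr

# Hypothetical helper: copies one Cdouble from the pointed-to array via memmove.
function load_pointed(fl::FirstOrLast)
    out = Ref{Cdouble}(0.0)
    ccall(:memmove, Ptr{Cvoid}, (Ref{Cdouble}, Ptr{Cdouble}, Csize_t),
          out, fl, sizeof(Cdouble))   # fl is passed directly, never fl.ptr
    return out[]
end

a = load_pointed(FirstOrLast([1.0, 2.0], [3.0, 4.0]))
b = load_pointed(FirstOrLast([1.0, 2.0], [3.0, 4.0], false))
```

Note that fl.ptr never appears at the call site, so there is no window in which the raw pointer exists without its owner being protected.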


Which brings me to the point that I need to go back and rewrite some of my code… :stuck_out_tongue:

EDIT: corrected the reversed order of cconvert and unsafe_convert from ccall, as noted by @mzaffalon


Thanks for the nice writeup.

That the pointer handling in Cairo.jl is wrong is really a revelation, as for a (very) long time no part of the Julia compiler complained about it. No error message, no warning (btw: if you look very closely into git you see that the original code for the ccalls was written and never touched again).

I can follow your example in the second block, still I’d like an answer as to why GC decides to run a finalizer on ‘surf’ in my example before the call finish(surf) is reached. The example is test/runtests.jl and is run in my session as include("test/runtests.jl"). All here should be global.


I’m not sure if I can answer your question, but one possible explanation is that finish(surf) gets inlined and all references to surf are gone, so GC exercises its freedom to collect it. E.g.

function myunsafe_load(fl::FirstOrLast, n::Integer)
    return unsafe_load(fl.ptr, n)
end

function get_nth(fl::FirstOrLast, n::Integer)
    (...)
    w = myunsafe_load(fl, n)
    (...)
    return w
end

If myunsafe_load gets inlined, the compiler will see w = unsafe_load(fl.ptr, n) and may choose to GC fl beforehand, substituting just the actual pointer into unsafe_load (a pointer which will then no longer be valid).
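If a helper really has to touch the raw pointer outside of ccall, one way to make it robust against inlining is to preserve the owner inside the helper itself. A sketch with a hypothetical Holder type (a cut-down stand-in for FirstOrLast) and a hypothetical safe_load helper:

```julia
# Hypothetical owner type: `data` owns the memory, `ptr` points into it.
mutable struct Holder
    data::Vector{Float64}
    ptr::Ptr{Cdouble}
    Holder(data::Vector{Float64}) = new(data, pointer(data))
end

function safe_load(h::Holder, n::Integer)
    # h (and therefore h.data) is rooted while the load runs,
    # whether or not this function gets inlined into its caller.
    GC.@preserve h unsafe_load(h.ptr, n)
end

function get_nth(h::Holder, n::Integer)
    w = safe_load(h, n)
    return w
end

w = get_nth(Holder([10.0, 20.0, 30.0]), 2)
```

The preservation travels with the helper, so callers do not need to reason about where the last use of h is.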

I always thought that @testset introduces a scope?

I agree. Although it’s hard to find in the manual.

btw: I’ve created a Cairo that works with Base.unsafe_convert.
And I get the same failures, as the finalizer seems to be called before finish.

Isn’t it the other way around? a → cconvert → unsafe_convert? From the manual:

unsafe_convert(T, x)

Convert x to a C argument of type T where the input x must be the return value of cconvert(T, ...) .

indeed, you’re right, I’ve corrected the original answer, thanks!