On the garbage collection

bicycle1885 · March 7, 2020, 7:55pm

Hi, I still feel some difficulty in understanding garbage collection. My understanding is that Julia has the mark-and-sweep garbage collector, which marks a set of GC-rooted objects and other objects traceable (or reachable) from them, and then sweeps unmarked objects. In addition, the garbage collector is accurate, meaning only pointers to valid Julia objects are GC-rooted and traced in the marking process. I think this property makes the GC root placement more complicated to understand, and I’m not sure which objects exactly are GC-rooted in Julia.

For example, I have no idea why, in the follwing code taken from the standard library, we need to GC.@preserve the object referenced by x:

# base/deepcopy.jl
function deepcopy_internal(x::String, stackdict::IdDict)
    if haskey(stackdict, x)
        return stackdict[x]
    end
    y = GC.@preserve x unsafe_string(pointer(x), sizeof(x))
    stackdict[x] = y
    return y
end

Isn’t the string object referenced by x GC-rooted? More generally, aren’t objects referenced by local variables in the call stack always GC-rooted? I would like to know how the GC root placement pass works and which objects are GC-rooted. I can find some description on this topic in a devdocs section, but it is not very helpful to me because I do not share the background knowledge required to understand its details.

Thank you in advance.

yuyichao · March 7, 2020, 8:24pm

You are not asking the right questions. GC, as a concept, doesn’t really exist in julia code. It only exist in the lower level implementation.

What exist in julia code, instead, is the guarantee that you don’t need to worry about anything implementation related (GC included) if you are just writing normal julia code and a set of guarantees that allow you to write unsafe code correctly.

So, with that in mind,

None of these properties has anything to do with understanding how to interact with the GC. None of these has anything to do with why GC.@preserve is needed in any cases. That’s because all the properties are implementation detail about the GC and either interacting with the GC or GC.@preserve are GC API for julia code. There’s basically no one-to-one correspondence between the two. Of course the two are related and for a particular version you may say certain way to deal with the GC is needed because otherwise some particular optimization is allowed to break your code. However, it is important to remember that going through that path will only cause you to rely on implementation detail.

Yes, no, it depends. If the compiler think it is necessary then it’ll be rooted. If the compiler think it’s not then it won’t. For all what you care about you don’t need to do anything if you simply want to use a local variable but that by no way mean a local variable is always GC-rooted. It doesn’t even imply that the concept of GC-rooted even make any sense at all for a particular object. Bottom line, again, is that the implementation will make sure the object is valid when you use it, which may or may not have anything to do with GC-rooting. It’ll just guarantee that the object is valid when you use it.

That doc has nothing to do with julia code. It’s all about implementation and you won’t be able to find anything from there regarding how to write correct julia code.

Now finally,

The use of x does not require any code to make sure it’s valid. However, since pointer(x) is just a number and can be passed around and used in any way the user wishes, there’s no way julia can keep track of how it is used in general and when is the user done using it. That’s why GC.@preserve is used here. It makes sure that pointer(x) is valid before the enclosing expression finishes by making sure all memory related to x is valid.

Of course in this case it is pretty obvious how long x needs to be alive for and a smart enough analyser can figure that out pretty easily. Given the code following it the compiler might now even bother invalidating the root in any case. However, none of such rules can be made stable/reliable so the you must use one of the few guaranteed way to ensure the validity of the memory you are using no matter the context.

It’s also worth mentioning that the requirement to preserve objects correctly may not have anything to do with the GC. It should be possible to construct cases where the code will return garbage value/crash without correct GC.@preserve or equivalent even if you’ve disabled the GC. In that sense GC.@preserve may not be the best name. It’s really about writing unsafe code rather than GC…

Note that I’m assuming you want to know how to write code correctly rather than wanting to know all the detailed implementation of the compiler. Even if you do want to know about the internals, it is a requirement to understand the guarantee to julia code and how to write it correctly before trying to understand why the implementation is the way it is. Reading related devdocs will not help you in that regard.

bicycle1885 · March 7, 2020, 9:21pm

Thank you for your explanation. But I’m still a little bit confused, possibly because I cannot distinguish safe normal code from unsafe unusual one. I quickly grepped the standard library and discovered several cases that look very similar to the example I posted above but without GC.@preserve:

# base/hashing2.jl
function hash(s::Union{String,SubString{String}}, h::UInt)
    h += memhash_seed
    # note: use pointer(s) here (see #6058).
    ccall(memhash, UInt, (Ptr{UInt8}, Csize_t, UInt32), pointer(s), sizeof(s), h % UInt32) + h
end

# base/bitset.jl
function ==(s1::BitSet, s2::BitSet)
    # Swap so s1 has always the smallest offset
    if s1.offset > s2.offset
        s1, s2 = s2, s1
    end
    a1 = s1.bits
    a2 = s2.bits
    b1, b2 = s1.offset, s2.offset
    l1, l2 = length(a1), length(a2)
    e1 = l1+b1
    overlap0 = max(0, e1 - b2)
    included = overlap0 >= l2  # whether a2's indices are included in a1's
    overlap  = included ? l2 : overlap0

    # Ensure non-overlap chunks are zero (unlikely)
    _check0(a1, 1, l1-overlap0) || return false
    if included
        _check0(a1, b2-b1+l2+1, l1) || return false
    else
        _check0(a2, 1+overlap, l2) || return false
    end

    # compare overlap values
    if overlap > 0
        _memcmp(pointer(a1, b2-b1+1), pointer(a2), overlap<<3) == 0 || return false
    end

    return true
end

In both of the methods, the pointer of a heap-allocated object is passed to other function but those objects are not GC.@preseved. Are there any rules that I don’t know?

yuyichao · March 7, 2020, 9:39pm

No you are not missing anything. You just discovered the huge pile of invalid code in Base that I give up fixing at some point…

bicycle1885 · March 8, 2020, 5:55am

Oh, really. I expected that these were safe because memhash and memcmp do not allocate and thus garbage collection will never be triggered while executing these operations. Why are these method definitions invalid? Perhaps other preemptive threads may trigger garbage collection?

yuyichao · March 8, 2020, 6:08am

No, as I said, it’s wrong to start thinking that way. It is not guaranteed to work so it is wrong. It’s as simple as that.

Also, even though it doesn’t really apply here yet, the GC has little to do with the need to preserve an object (well, or only partially). It’s all about the compiler. The GC can be absolutely chill and not doing anything for this kind of incorrect code to crash.

Now if you are talking about current implementation. Well, I don’t think running that code with optimization on will likely trigger any real issue, since the compiler has little interest to mess with it (any transformation that causes crash here will likely decrease performance). However, its wrong to say a function do no allocate. For one, allocation (in GC sense) doesn’t exist as a concept either. Creation of new object does but it’s an orthogonal concept from allocation, i.e. GC allocation can happen without new object being created and new object can be created without allocation. Also, no where does julia say any code have to be optimized to the way you want and AFAICT such thing can actually happen in debugger if not in some low optimization level mode. (it is unlikely to trigger any real issue either in the debugger but again that’s not the point)

In another word, this is the same deal as any undefined behavior. You can hope the compiler to not mess with you and in most case your hope will look correct since the compiler isn’t looking for the best way to screw you up. However, at some point, when you stopped watching, bad things can happen out of nowhere.

bicycle1885 · March 8, 2020, 6:39am

I see. I think I understand your point. So, then, I would like to know the guideline of using GC.@preserve because Julia has neither formal specification nor warnings from compiler of abuse. The current docstring of GC.@preserve does not make sense without detailed knowledge of when objects may be garbage collected. Do we need to GC.@preserve an object whenever we handle a pointer derived from it? Any other cases we need GC.@preserve?

help?> GC.@preserve
  GC.@preserve x1 x2 ... xn expr

  Temporarily protect the given objects from being garbage collected, even if they
  would otherwise be unreferenced.

  The last argument is the expression during which the object(s) will be preserved.
  The previous arguments are the objects to preserve.

yuyichao · March 8, 2020, 7:04am

It’s needed whenever you are using manipulating what belong to an object without using an reference to that object.

The only way this can cause crash is obviously when you have unsafe code (mostly pointers) since you aren’t supposed to get crashes in any other way. (If you do it’s not your but bug julia bug). However, you may also need it, boardly speaking if you extend the definition of what object owns what, as shown below.

In the most abstract sense, everything you are operating on is owned by some object and as long as you can figure this ownership out then you’ll have no problem writing correct code. Usually the owner is the object you are manipulating but when it isn’t then that’s when you need to be careful. The most obvious example is certainly pointers. When you pass them around, unless you are just using them as random bit patterns, the memory operetion you do on the pointer belongs to whereever you get that pointer from and you need to preserve that object using one of the few mechanisms.

Now, apart from pointers, you can certainly define you own ownership and assign one non-pointer object to be owned by another object, i.e. I can have a = A(); b = B() and just say b actually belongs to a as the “contract” in my API. However, for this to be of any significance/necesity the freeing/destruction of the owner must have an effect on the object being owned. Apart from the memory itself this pretty much only happen with finalizers. As an example if you have b = Ref{Int}(1); a = Ref{typeof(b)}(b) and add an finalizer to a that clears b to 0. In order for b to be valid (non-zero) you now need to make sure a is preserved during the operation. (Note that this case is also very subjective and is entirely up to the creater of the API to determine the ownership. b having a value of 0 might be entirely valid for example and is merely keeping count of something)

Another way of seeing this is that we have normal memory that is managed by the compiler/GC and we have object/memory that is managed by the finalizer. The normal memory always has an object attached as owner and the finalizer is also always registered to an owner object. You just need to figure out if the validity of an object/memory/things is managed as normal object memory or by a finalizer, and then you can figure out the owner easily from there.

In short, you need to figure out the owner of whatever you are operating on. It is pretty much part of the API what the owner is and an object that is now owned by itself should be explicited documented one way or another. Pretty much the only two ways this can happen is from derived pointer and finalizers so you should get a pretty good guess without looking.

bicycle1885 · March 9, 2020, 11:16am

Thank you very much! I believe I understand the concept of object management. In order to make your last example more concrete, I’d like to post a short snippet with commentary.

let
    # b is implicitly owned by a, but Julia doesn't know the fact.  Therefore,
    # b may be invalidated by the finalizer of a while you're operating on b.
    b = Ref{Int}(1)
    a = Ref{typeof(b)}(b)
    finalizer(x -> (x[][] = 0), a)
    # No longer safe to operate on b (b may be zeroed!).
    @show b  # So, this is unsafe!
end

let
    # Same above, but with explicit object preservation.
    b = Ref{Int}(1)
    a = Ref{typeof(b)}(b)
    GC.@preserve a begin
        finalizer(x -> (x[][] = 0), a)
        # It is safe to operate on b (b will never be zeroed).
        @show b  # So, this is fine!
        # The safe zone ends here.
    end
    # No longer safe to operate on b (b may be zeroed!).
end

Tamas_Papp · March 9, 2020, 11:36am

Please consider making a PR to the docs based on the discussion in this topic.

c42f · March 11, 2020, 6:46am

I find this example a little confusing, maybe because “safe” here has a somewhat unusual meaning (it could be good to elaborate on that if this is going into documentation).

I think you can get a little more symmetry by moving the finalizer out of the preserve block in the second part:

let
    # Same above, but with explicit object preservation.
    b = Ref{Int}(1)
    a = Ref{typeof(b)}(b)
    finalizer(x -> (x[][] = 0), a)
    GC.@preserve a begin
        # It is safe to operate on b (b will never be zeroed).
        @show b  # So, this is fine!
        # The safe zone ends here.
    end
    # No longer safe to operate on b (b may be zeroed!).
end

A more straight forward and directly relevant example could be conversion to a pointer and a subsequent load. Though this may not be what you’re trying to demonstrate.

let
    a = Ref{Int}(1)
    # The following conversion itself is safe, as Julia knows about our
    # reference to `a` in this expression:
    p = Base.unsafe_convert(Ptr{Int}, a)
    # Julia is now unaware that memory pointed to by `p` is logically owned by `a`
    # So in any "use" (ie, dereference) of `p` we must preserve `a`
    GC.@preserve a unsafe_load(p)
end

bicycle1885 · March 12, 2020, 3:46pm

c42f:

I think you can get a little more symmetry by moving the finalizer out of the preserve block in the second part:

let
    # Same above, but with explicit object preservation.
    b = Ref{Int}(1)
    a = Ref{typeof(b)}(b)
    finalizer(x -> (x[][] = 0), a)
    GC.@preserve a begin
        # It is safe to operate on b (b will never be zeroed).
        @show b  # So, this is fine!
        # The safe zone ends here.
    end
    # No longer safe to operate on b (b may be zeroed!).
end

I’m a little bit worrying about this because a is not preserved after its finalizer is registered. Julia may insert some code between finalizer and GC.@preserve. But more likely situation is that a careless programmer inserts some code and that invalidates the safety. So I think a should be preserved before its finalizer is registered. I’m not sure if this is safe.

yuyichao · March 12, 2020, 4:19pm

No there’s nothing to worry about there. a is guaranteed to be valid within that block as far as you can tell. That’s the only thing that matters here.

Note though that this does not mean a is valid anywhere BEFORE the GC.@preserve block.

c42f · March 13, 2020, 3:55am

An excellent point! I would slightly restate this as: “no implicit use of a outside the GC.@preserve is safe”.

To expand a bit: compilers are required to act as if the program statements execute in order. However, when the compiler can prove to itself that order doesn’t matter, it’s free to reorder things and will do so if it thinks that’s more efficient. So if you don’t tell the compiler that a statement depends on a being live it’s free to move that statement to occur after the GC.@preserve.

It gets even more subtle because very similar considerations extend to the hardware: if the Julia compiler generates native code which is apparently in program order (relative to the source text), but doesn’t layer this code with the necessary semantics the hardware itself may reorder things at runtime. Again, this happens only if the hardware can prove that the ordering doesn’t matter. This can be a problem especially for memory ordering in multithreaded programs which don’t use atomic variables correctly (see the discussion at @runtime_ccall thread safety · Issue #106 · JuliaGPU/CUDAapi.jl · GitHub)

When the compiler (or hardware) proves it’s allowed to do an “invalid transformation” in the examples above, this is an error in the source program which provided incorrect assumptions leading to the proof.

c42f · March 13, 2020, 4:03am

It’s safe because the GC.@preserve block can only run while holding a live, no matter what other transformations the compiler may choose to do. (And the line using finalizer references a directly so that’s clearly ok.)

If the compiler wants to insert some other code between the finalizer and the GC.@preserve that’s also ok; it can only do so if that would respect the invariants you’ve told it at the source level.

yuyichao · March 13, 2020, 4:14am

These implemenation detail based explaination are fine as long as you don’t treat it as an exhausted list. The compiler can always find ways you haven’t think of to trick you. OTOH, they are always good as examples of such surprising behaviors.

On the note of multiple thread, I’ll say if you have a used of a that is synchronized to happen inside the GC.@preserve block it should be safe too. I believe this is currently the case, it shouldn’t be hard to maintain, and it’ll be very hard to reason about the program is this guarantee is not maintained. It should also be fine from the other viewpoint (i.e. if you send a value to be preserved synchronously on another thread). It gets tricky there but I think it’s safe (and I believe it needs to be safe).

rryi · March 15, 2020, 7:24am

Thanks for the elaborated discussion, it clarified a lot to me, but some questions remain.

I understood (simplified): using pointers should always be protected by a surrounding GC.@preserve.
Am I right that this code from Julia 1.0.5, iobuffer.jl line 462 ff is an incorrect use, because the pointer p is defined outside the @preserve block instead of inside it?

function occursin(delim::UInt8, buf::IOBuffer)
    p = pointer(buf.data, buf.ptr)
    q = GC.@preserve buf ccall(:memchr,Ptr{UInt8},(Ptr{UInt8},Int32,Csize_t),p,delim,bytesavailable(buf))
    return q != C_NULL
end

My suggestion for a correction would be simply

function occursin(delim::UInt8, buf::IOBuffer)
    q = GC.@preserve buf ccall(:memchr,Ptr{UInt8},(Ptr{UInt8},Int32,Csize_t),pointer(buf.data, buf.ptr),delim,bytesavailable(buf))
    return q != C_NULL
end

IOBuffer allocates its working memory as a String and wraps it in a UInt8 vector:

# allocate Vector{UInt8}s for IOBuffer storage that can efficiently become Strings
StringVector(n::Integer) = unsafe_wrap(Vector{UInt8}, _string_n(n))

Using the String instance directly in IOBuffer (instead of the UInt8 vector wrapper) would
spare one indirection. However, all String code unit accesses need a @preserve guard, see
codeunit definition in string.jl:

@inline function codeunit(s::String, i::Integer)
    @boundscheck checkbounds(s, i)
    GC.@preserve s unsafe_load(pointer(s, i))
end

If we compare the current IOBuffer imlementation with the possible alternative to use
the String instance directly with @preserve wrapped pointer accesses:
does wrapping in an UInt8 vector yields better performance, despite the added indirection,
because @preserve is avoided and compiler can do better optimizations, or is the UInt8 vector wrap
due to better code reuse with GenericIOBuffer and the rich array API for the price of a small performance penalty, because the internal array element access code has to do similar things @preserve does?

yuyichao · March 15, 2020, 11:55am

No, the use is perfectly fine.

No.

Sukera · March 15, 2020, 12:14pm

If I understand correctly, the @preserve preserves buf, not the pointer. This is necessary to keep the memory the pointer is pointing at valid while in the ccall.

yuyichao · March 15, 2020, 1:14pm

More precisely, the @perserve is for the memory the pointer points to (which belongs to buf) rather than the pointer itself. It’s perfectly legal to do all sort of operations on the pointer, say generating it, adding an offset, etc, and as long as you are not using the result in any way that relies on the memory it points to you don’t need to do that inside a GC.@preserve block.

Now GC.@perserve has little to no performance impact for any normal code, especially if you are extending it backward rather than forward (i.e. including more code before the important one) so either is fine. It’s more of an issue of which one is easer to write than anything else.

Topic		Replies	Views
Avoid GC freeing Ptr{Nothing} General Usage	25	2105	May 29, 2020
Finalizer only works with mutable structs? General Usage	74	3540	December 21, 2019
Does `GC.@preserve a b` preserves the location of `a` and `b` General Usage	11	1763	May 28, 2020
C struct garbage collection not run frequently enough General Usage garbage-collection , mutable-structure , c , gc	28	443	July 14, 2024
Does `GC.@preserve` also prevent instruction re-ordering? General Usage	7	562	March 1, 2019

On the garbage collection

Related topics