Finalizer only works with mutable structs?

What reference are you storing though? An immutable object doesn’t have a single reference (it may have many). You can always use Ref(x) to make a strongly-typed, stable reference, however (e.g. let x = Ref(x); GC.@preserve x begin; ccall(..., (Ptr{Cvoid},), x); ...; otheruse(x); end), and also attach a finalizer to it and so on. It’s nothing particularly special though, just a mutable struct itself. (There are also a lot of similar options, such as Ref{Any}(x) and (Ref{T},), depending on what’s appropriate and convenient.)
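A minimal sketch of the Ref pattern from the post above. It assumes libc's memcpy as a stand-in for the elided C function, and copy_via_c is a hypothetical name used only for illustration.

```julia
# Hypothetical helper: copies a Cint through C's memcpy using stable references.
function copy_via_c(v::Cint)
    src = Ref(v)          # RefValue{Cint}: a mutable box with a stable address
    dst = Ref{Cint}(0)
    result = GC.@preserve src dst begin
        # Passing the Refs themselves lets ccall do the pointer conversion.
        ccall(:memcpy, Ptr{Cvoid}, (Ptr{Cvoid}, Ptr{Cvoid}, Csize_t),
              dst, src, sizeof(Cint))
        # "Other use" of a raw pointer: only valid while dst is preserved.
        p = Base.unsafe_convert(Ptr{Cint}, dst)
        unsafe_load(p)
    end
    return result
end
```

The ccall alone would not need GC.@preserve here, since the Refs are passed directly; the block matters for the raw pointer p that is used after the call.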

I didn’t mean reference literally, as in Ref. It’s actually a mutable struct now, which I store in a Dict where I know it will survive all future uses.

Unfortunately, I don’t think I can use the GC.@preserve block in my case, as the other uses (besides the ccall) are not in the same scope at all.

If there is interest in this use case, I can mock up some simplified C header with Julia wrapper.

EDIT: I found that this case is already handled well by the blog post on C callbacks, linked from the Julia docs.

You almost never want to call pointer_to_objref yourself.
(If you are calling it, you are likely doing it wrong for mutable types as well)

OK, suppose I have a C function that accepts array of pointers to structs:

typedef struct
{
 ...
} _M;

typedef _M *m_t;

void
test_struct(m_t *str_arr)
{
   ...
}

How do I call it from Julia? I verified that

mutable struct M
    ...
end

m = Array{M}(undef,N)
m_p = Array{Ptr{M}}(undef,N)
for n = 1:N
    m[n] = M()
    m_p[n] = pointer_from_objref(m[n])
end

ccall((:test_struct, :mylib), Cvoid, (Ref{Ptr{M}},), m_p)

works. Is that not the right way?

No, it’s not. For anything GC-related, you can never verify that something works by running it. NEVER. The GC runs very infrequently, and things can easily appear to work since neither the compiler nor the GC is actively trying to exploit the wrong code. Both of them just don’t care about what you have. They are just trying to reduce the amount of work and will not, by default, do the extra work required to make bad code crash.

For your case, nothing keeps m alive at any point, so m_p[n] is allowed to be a completely garbage pointer.

In general, you should call cconvert and unsafe_convert to populate your m and m_p and then keep m alive during the ccall with GC.@preserve. i.e. m[n] = Base.cconvert(Ptr{M}, M()); m_p[n] = Base.unsafe_convert(Ptr{M}, m[n]).

In this case, if you know M is mutable, you can indeed reimplement cconvert and unsafe_convert yourself. That is indeed what you did, and you are just missing the code to keep m alive.

That’s also why I said “almost”. It’s basically for the case that you know what cconvert and unsafe_convert would do and prefer to reimplement them yourself for one reason or another. You generally only care about this if you care about the (type of the) return value of cconvert (which you kind of do in this case).
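A sketch of the cconvert / unsafe_convert / GC.@preserve pattern described above. It assumes Ref{...} as the target type (the thread later corrects Ptr{M} to Ref{M}), and it reads the first field back through each pointer in place of the real C call. The names Item and first_fields are hypothetical.

```julia
# Hypothetical mutable struct standing in for the thread's M.
mutable struct Item
    x::Cint
end

# Builds owner objects and raw pointers, then reads each struct's first
# field through its pointer (a stand-in for passing `ptrs` to C).
function first_fields(N::Integer)
    # cconvert returns an owner object (a RefValue{Item}), not a pointer.
    owners = [Base.cconvert(Ref{Item}, Item(n)) for n in 1:N]
    ptrs = Vector{Ptr{Item}}(undef, N)
    vals = GC.@preserve owners begin
        for n in 1:N
            # unsafe_convert turns each owner into a raw Ptr{Item}.
            ptrs[n] = Base.unsafe_convert(Ref{Item}, owners[n])
        end
        # A real program would call ccall with `ptrs` here.
        [unsafe_load(convert(Ptr{Cint}, ptrs[n])) for n in 1:N]
    end
    return vals
end
```

The pointers in ptrs are only meaningful inside the GC.@preserve block, because that is the only region where the owners are guaranteed to stay alive.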

Thank you, I understand about GC issue, but not the rest.
The manual says

Neither convert nor cconvert should take a Julia object and turn it into a Ptr.
However the call

 m[n] = Base.cconvert(Ptr{M}, M()); 

does just that?

Secondly, the unsafe_convert converts Ptr{M} to Ptr{M} ?

Structure M has pointer to memory that must be initialized by a C library, and from the reading of discussions here I understood, perhaps incorrectly, that the only way to do that is to make it mutable.
(M is really BigInt)

No. m is a vector of M and cconvert did not return a pointer.

No, it converts M to Ptr{M}, essentially by calling pointer_from_objref.

No. You can get a C-initialized struct with a pointer field however you want. You just can’t transfer the ownership of that pointer to Julia and use a finalizer on the object to manage that pointer without having a mutable object to own that memory.
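As an illustration of a mutable object owning C memory through a finalizer, here is a minimal sketch. CBuffer and release! are hypothetical names, and Libc.malloc / Libc.free stand in for whatever allocator the real C library uses.

```julia
# Hypothetical wrapper that owns a malloc'd buffer.
mutable struct CBuffer
    ptr::Ptr{Cvoid}
    function CBuffer(nbytes::Integer)
        p = Libc.malloc(nbytes)
        p == C_NULL && throw(OutOfMemoryError())
        buf = new(p)
        # finalizer needs a mutable object; it throws for immutable structs.
        finalizer(release!, buf)
        return buf
    end
end

# Frees the buffer once; safe to call again afterwards.
function release!(buf::CBuffer)
    if buf.ptr != C_NULL
        Libc.free(buf.ptr)
        buf.ptr = C_NULL   # guard against a double free
    end
    return nothing
end
```

Calling finalize(buf) runs the registered finalizer immediately, which is handy for deterministic cleanup; otherwise it runs at some point after the object becomes unreachable.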

No. m is a vector of M and cconvert did not return a pointer.

Sorry, I should have read the source. For Type{<:Ptr}, cconvert is a no-op, so that was what you meant by "if you know M is mutable, you can indeed reimplement cconvert and unsafe_convert yourself. That is indeed what you did and you are just missing the code to keep m alive." Although, where does the mutability enter into it?

You can get C initialized struct with pointer field however you want.

In the constructor of an immutable type, I have to get initialized memory somehow. What can I pass to a C initializer function other than a pointer to a mutable type that mirrors the definition of the immutable type?

Because I should’ve said Ref{M} instead of Ptr{M}.
And then you can see that for an immutable type cconvert isn’t a no-op.

You will not construct that object in Julia; you let C fill in the memory and you load that object in Julia. The memory can be either Julia (managed) memory or C (unmanaged) memory, and it can be allocated either in Julia or in C.
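A sketch of the "C fills the memory, Julia loads it" route using unmanaged memory. It assumes memset as a stand-in for the real C initializer; Point2 and load_from_c_memory are hypothetical names.

```julia
# Hypothetical immutable (isbits) struct mirroring a C struct of two ints.
struct Point2
    x::Cint
    y::Cint
end

function load_from_c_memory()
    # Unmanaged memory: allocated with malloc, so the GC never touches it.
    p = convert(Ptr{Point2}, Libc.malloc(sizeof(Point2)))
    p == C_NULL && throw(OutOfMemoryError())
    try
        # memset stands in for the C initializer that fills the struct.
        ccall(:memset, Ptr{Cvoid}, (Ptr{Cvoid}, Cint, Csize_t),
              p, 0, sizeof(Point2))
        # Copy the bytes out into an ordinary immutable Julia value.
        return unsafe_load(p)
    finally
        Libc.free(p)   # we own the memory, so we must free it ourselves
    end
end
```

Since the memory is not managed by Julia, there is no liveness question here, only the usual C obligation to free it exactly once.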

cconvert(Ref{M}, M()) is not a no-op for a mutable M either, so it is not clear to me what exactly you have in mind here.

You can get C initialized struct with pointer field however you want.

OK, then in an immutable constructor I may do something like

m = Array{UInt8}(undef,sizeof(M))
mp=Base.unsafe_convert(Ref{M},m)
ccall((:struct_init, :mylib), Cvoid, (Ref{M},), mp)
unsafe_load(mp,1)

which does seem to get correctly initialized memory, but contradicts the documentation by not passing the output of cconvert to unsafe_convert.

The implementations of the two conversion functions together are effectively the same as a no-op plus a pointer_from_objref.

Just a Ref would work. If anything, you should use Array{M}

Err, which part contradicts the documentation? The documentation doesn’t say what would NOT work, it only says what would work. What you have is indeed not what the documentation says should work, but that just means it may not be well-defined code. It doesn’t mean the code would not work or would not appear to work.

In your code, you are still not guaranteed to get a valid pointer out of m, i.e. mp could be garbage since you still failed to keep m alive.

According to the documentation, and using each of the conversion functions as well as unsafe_load explicitly, you need

m = Ref{M}()
mc = Base.cconvert(Ref{M}, m) # Converting to `Ptr{M}` is fine here too, as long as it agrees with the line below.
mp = Base.unsafe_convert(Ref{M}, mc)
GC.@preserve m begin
    ccall(....., mp)
    return unsafe_load(mp)
end

Now, you don’t have to do the unsafe_load manually. Since you have access to the Julia object, you can just do m[], whether m is a RefValue{M} or a Vector{M}.

You also don’t have to do all the conversions explicitly. The code with unsafe_convert/cconvert and GC.@preserve above is basically an expansion of ccall(..., m) (not how things are implemented, but effectively the same), so you only need m = Ref{M}(); ccall(..., m); return m[]
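The short form can be sketched as follows, assuming memset in place of the thread's struct_init (which is not available here). Pair32 and init_with_c are hypothetical names.

```julia
# Hypothetical immutable struct standing in for the thread's M.
struct Pair32
    a::Cint
    b::Cint
end

function init_with_c()
    m = Ref{Pair32}()   # storage for one Pair32, contents not yet initialized
    # ccall performs cconvert/unsafe_convert and keeps `m` alive for the call.
    ccall(:memset, Ptr{Cvoid}, (Ref{Pair32}, Cint, Csize_t),
          m, 0, sizeof(Pair32))
    return m[]          # load the value that C wrote
end
```

No GC.@preserve is needed in this form because m itself, not a raw pointer derived from it, is what gets passed to ccall.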

Thank you for this excellent explanation. So

m = Ref{M}()

“This type is guaranteed to point to valid, Julia-allocated memory of the correct type.” And that must hold for immutable types, as well, since documentation does not say otherwise.

Is it possible to manually trigger memory corruption without GC.@preserve? I tried
GC.gc(true)
but the code still runs

No. It’s just a normal type. What you get by converting it to a pointer of the correct pointer/ref type is a pointer to memory that can hold the correct type. The pointer you get may not have the correct type (it may not have the tag, or not the correct one, and calling jl_typeof on it in C will not give you the “correct” answer).

Not in general. The compiler also may or may not decide to tell the GC that certain objects aren’t used by your code anymore.

“It” here refers to the result of calling Ref{M}() ?
So looking at the source, what it does is just put M inside the struct and create that struct without calling constructor for M:

RefValue{T}() where {T} = new()

that new() call applies to RefValue struct, so I guess its implementation just reserves sizeof(M) bytes within RefValue instance and claims to Julia that it is M, without actually writing anything into that memory. Correct? Then the documentation could be clearer on that point, I think.

Yes.

AFAICT, what you said was just a somewhat OK but somewhat inaccurate description of creating a type with a single field of type M.

Also, I’m not sure what you know and what you are trying to understand now. What you’ve said so far mixes different levels (the Julia code level and the implementation level). Without knowing what you want and what you know, it’s very hard to judge exactly what you meant and what you want to get out of it or confirm.

For example, both “Julia-allocated memory of the correct type” and “claims to Julia that it is M” are very vague. They can both be correct or incorrect depending on how you interpret them, but the correct interpretation might not be how you interpreted them.

Mostly I am trying to understand the memory handling from the point of view of someone who is mainly a C programmer and has studied Julia for about a week.

So in the simplest version

m=Ref{M}()
ccall(..., (Ref{M},), m)
return m[]

is it the use of m after the ccall that protects m from the GC, and without it the ccall might fail?

I think it’s best to understand what Julia guarantees in terms of memory management and to try to separate that from the implementation…

There are multiple layers happening at the same time.

You need to first understand that in Julia objects are not tied to addresses, so until you convert something to a C pointer, it doesn’t make sense to talk about addresses (Julia will find somewhere to store your data, and you can learn the internals to figure out how it does so, but that has no significance and isn’t something you can rely on).

Then you can learn what basic pointer conversions there are. This includes pointer and pointer_from_objref and the object memory layout they expose to you. This is also where the user-facing APIs (cconvert and unsafe_convert) lie. Basically, cconvert and unsafe_convert on Ptr and Ref are wrappers around pointer and pointer_from_objref in one way or another. Here you’ll also understand the ownership of the pointer you get. In a correct implementation of these functions, the result of pointer and pointer_from_objref belongs to the argument. This is not always possible (for immutable types it basically can’t be), so in general cconvert returns an object that owns the pointer returned by calling unsafe_convert on it.

Finally, you need to know what liveness guarantees Julia provides. There are basically three in pure Julia: GC.@preserve, ccall, and global variables. As long as the owner of the memory is guaranteed to be rooted/preserved by one of the three, you can use the pointer.
As an example, if you have a = Vector{Float64}(...); GC.@preserve a begin p = pointer(a); <...> end then within <...> you can use p however you want, and it is guaranteed to correspond to the content of a. Similarly, if you have ccall(...., a) and unsafe_convert and cconvert are defined correctly for a, then there won’t be any problems (and similarly for global variables).
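The GC.@preserve case above can be written out as a small sketch. sum_via_pointer is a hypothetical name, and reading through unsafe_load stands in for whatever pointer-based work would go in <...>.

```julia
# Sums a vector by reading through a raw pointer, which is only valid
# while the owning array is preserved.
function sum_via_pointer(a::Vector{Float64})
    total = 0.0
    GC.@preserve a begin
        p = pointer(a)                 # raw Ptr{Float64} owned by `a`
        for i in 1:length(a)
            total += unsafe_load(p, i) # 1-based element index
        end
    end
    # Using `p` past this point would no longer be covered by the guarantee.
    return total
end
```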

No. I believe the ccall documentation describes this in detail (also see above): ccall guarantees that the return value of cconvert is live throughout the whole call. The m after the call NEVER has any significance in determining the lifetime as far as the semantics are concerned. In reality it could, which is why if you do ccall(..., pointer_from_objref(m)); return m[] it could actually work even though it’s completely invalid code.

The m after the call NEVER has any significance in determining the lifetime

How so? The GC can’t eliminate the objects while they are still being used, can it?

What “being in use” means is a bit fuzzy. The compiler can reorder things, pick things apart, remove things from ever existing, etc., however it wants. The only correct ways to influence lifetime are the ones that have been stated.

The GC doesn’t care in the slightest about objects. The GC cares about allocations.
So yes, the GC cannot eliminate an allocation if it’s still being used. So if m was allocated before the ccall and used after the ccall, and if the use after the ccall didn’t get eliminated, the GC won’t free the allocation.

However, an allocation is completely different from an object. The compiler takes care of that, and in this case it doesn’t have to satisfy basically any of the conditions above. The compiler can decide not to allocate m before the ccall, or not at all, or it can decide the use of m after the ccall is dead and eliminate it. There are many different possibilities, and most of them (including the separation of work between the compiler and the GC) are just implementation details of the language. That’s why I said you should understand each layer separately and stick to the actual guarantees.

It’s the same deal as UB in C. If the compiler doesn’t think it’s worth messing with you, you can write code with UB that appears to work. Once in a while, though, the compiler will find a smart way to surprise you.

Ok, to clarify. With complete code, consider

typedef struct MS {int c;} M;

extern void foo(M** arg){
    (*arg)->c = 1;
    return;
}

compiled via

$ clang -shared -fpic foo.c -o foolib.so

Now let’s do the julia side:

julia> mutable struct M
       x::Cint
       end

julia> arr=[M(2) for i=1:3]
3-element Array{M,1}:
 M(2)
 M(2)
 M(2)

julia> h(arr, i) = ccall((:foo, "./foolib.so"), Cvoid, (Ptr{Ptr{M}},), pointer(arr, i));

julia> h(arr, 1); arr
3-element Array{M,1}:
 M(1)
 M(2)
 M(2)

That works as expected, and is valid when called from the REPL, but is invalid when called from a compiled context due to GC issues: no one keeps arr alive. We can fix this by

julia> h_gc(arr, i) = GC.@preserve arr ccall((:foo, "./foolib.so"), Cvoid, (Ptr{Ptr{M}},), pointer(arr, i));

Both compile down to identical @code_native, but behave differently when inlined.
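For readers without foolib.so, the same pointer(arr, i) plus GC.@preserve shape can be sketched against libc alone. This is an analogue rather than the code above: memcpy stands in for foo, the array holds plain Cint values instead of M objects, and set_elem! is a hypothetical name.

```julia
# Writes `v` into arr[i] through a raw element pointer via C's memcpy.
function set_elem!(arr::Vector{Cint}, i::Integer, v::Cint)
    checkbounds(arr, i)   # pointer(arr, i) itself performs no bounds check
    src = Ref(v)
    # pointer(arr, i) is a raw Ptr, so ccall cannot know that `arr` owns it;
    # GC.@preserve supplies the missing liveness guarantee.
    GC.@preserve arr begin
        ccall(:memcpy, Ptr{Cvoid}, (Ptr{Cint}, Ref{Cint}, Csize_t),
              pointer(arr, i), src, sizeof(Cint))
    end
    return arr
end
```

Note that src needs no explicit preservation: it is passed to ccall as a Ref, so ccall keeps it alive for the duration of the call.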

If we were too lazy to deal with the entire GC.@preserve dance, we could also

@noinline h_ni(arr, i) = ccall((:foo, "./foolib.so"), Cvoid, (Ptr{Ptr{M}},), pointer(arr, i));

right @yuyichao?

I.e. just put a @noinline function barrier outside of contexts where we take pointers and trust that the calling context keeps our stuff alive? Or is the compiler allowed to do IPO through @noinline function barriers (apart from the caller knowing the LLVM attributes of the callee, which should be OK)?

For other people reading at home, the @code_native h(arr, 1) is a mess on 1.2.0, which is fixed in https://github.com/JuliaLang/julia/commit/350f514a91ab29b74cb2f81d0a7a134b8ac283bb (we had Base.pointer -> Base.elsize -> Core.sizeof(Ptr), which didn’t constant-fold and should never ever have been called; now it is Core.sizeof(Ptr{Cvoid}), which is a concrete type and inlines properly).