I have large structs with a lot of fields (think of multiple 100s) which I need to communicate with a C library. Julia creates the instances of a struct (memory allocation) but C fills in the values. Keeping its memory undefined on the Julia side should work, but it doesn’t as it turns out that the involved C code does not care about the initial memory content as long as it’s all zeros ;-).
Luckily, I can auto-generate the structs in Julia from a specification file quite easily. However, I am struggling with the initialization of a struct’s instance. The best way of constructing an all-zero instance I currently know of is to call each field’s constructor (using a support macro) and adding appropriate constructors where necessary which is some work with all the different types involved. Additionally, the necessary field types’ constructors sometimes do not really make a lot of sense on their own.
Is there an easier way to construct an all-zero instance of a struct (known to Julia’s GC)?
My thoughts so far:
I could call calloc directly instead of calling a constructor, but then the memory would not be known to Julia’s garbage collector and I would need to keep track of memory management myself which I’d rather avoid.
But if Julia offers a similar call and would inform the GC, that would be nice.
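For comparison, the manual route could look roughly like the sketch below. Libc.calloc and Libc.free are the standard Julia wrappers around the C functions; the struct CHeader, its fields, and the helper with_zeroed are hypothetical names used only for illustration. The point is that the memory is invisible to the GC, so the free call has to be managed by hand.

```julia
# Hypothetical stand-in for one of the generated structs.
struct CHeader
    id::Cint
    scale::Cdouble
end

# Allocate one zero-filled T with calloc, run f on the pointer, always free.
# The memory is NOT tracked by Julia's GC, hence the try/finally.
function with_zeroed(f, ::Type{T}) where {T}
    p = convert(Ptr{T}, Libc.calloc(1, sizeof(T)))
    p == C_NULL && throw(OutOfMemoryError())
    try
        return f(p)          # e.g. hand `p` to the C library here
    finally
        Libc.free(p)         # manual cleanup, easy to forget
    end
end

# Copy the zeroed struct out of C memory before it is freed.
header = with_zeroed(unsafe_load, CHeader)
```

This works, but every allocation needs a matching free, which is exactly the bookkeeping a GC-managed instance would avoid.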
Additionally, I am thinking of something like zeros, but for a struct instead of an Array. zeros calls fill!(array, 0), which for byte arrays calls memset. I think I could implement an analogous zeros method for a struct called in an inner constructor using incomplete initialization.
But maybe such a functionality exists already?
You could define your big struct in the following manner, which will give you a constructor that does not involve passing values for fields:
@kwdef struct MyBigStruct
    field1::Int = 0
    field2::Int = 0
    field3::Int = 0
end
someinstance = MyBigStruct()
# will produce: MyBigStruct(0, 0, 0)
However, a better approach might be something in the flavor of the following function:
function initzeros(::Type{T}) where {T}
    fvals = fieldtypes(T)
    T(zero.(fvals)...)
end
struct BigOne
    field1::Int
    field2::Float64
    field3::Rational
    # .... lots of additional fields
    field1000::Int8
end
niceinstance = initzeros(BigOne)
# will output your struct instance with all fields `zero` (in a type-safe manner).
Thanks for that idea. It would simplify the creation indeed a bit, because I’d need to touch each field only once (and not effectively a second time for the constructor call).
However, I think the major problem would persist: either every type must support the same constructor (e.g. one accepting a 0), which is quite unnatural for some field types (think of a String or an immutable struct), or I’d need to maintain a list of which constructor to call for which type. Neither is especially elegant.
Interesting, thanks, that’s definitely an improvement of my current status. Instead of implementing unnatural constructors for the fields’ types, I’d only need to implement a much more natural zero method for the types which lack one.
I am still hoping for a “one knob” solution which does not need to add multiple methods. Of course, I could try to implement zeros for type Any based on some recursive approach, but that’s probably hard to get right in general.
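To make the recursive idea concrete, here is a minimal sketch, assuming all fields are numbers, C pointers, tuples, or nested structs that still have their default positional constructor. The function name zerovalue and the example structs Vec3 and Sensor are hypothetical. Note that this sets field values to zero; it does not promise anything about padding bytes.

```julia
# Leaf cases: types that already have a natural "zero".
zerovalue(::Type{T}) where {T<:Number} = zero(T)
zerovalue(::Type{Ptr{T}}) where {T} = convert(Ptr{T}, C_NULL)

# Tuples (e.g. NTuple{N,T} mirroring a C array) are built element by element.
zerovalue(::Type{T}) where {T<:Tuple} = map(zerovalue, fieldtypes(T))

# Generic fallback: recurse into the fields of a concrete struct.
# Assumes the default positional constructor T(fields...) is available.
function zerovalue(::Type{T}) where {T}
    (isconcretetype(T) && isstructtype(T)) ||
        throw(ArgumentError("no zero value known for $T"))
    return T(map(zerovalue, fieldtypes(T))...)
end

# Hypothetical structs standing in for the generated ones.
struct Vec3
    x::Cdouble
    y::Cdouble
    z::Cdouble
end

struct Sensor
    id::Cint
    pos::Vec3
    samples::NTuple{4,Cfloat}
    handle::Ptr{Cvoid}
end

s = zerovalue(Sensor)
```

Types that need special treatment (for example ones with only inner constructors) would still require their own zerovalue method, so this reduces the number of methods rather than eliminating them.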
struct B
    field1::Float64
    field2::Int
    field3::Rational
    field4::String
    B() = new()
end
initialized = B()
# will produce: B(0.0, 0, #undef, #undef)
You no longer need to concern yourself with defining additional zero methods for all kinds of types.
My understanding is that the undef approach is compatible with what you wrote here (and it turns out that on Julia’s side is easier to go with undef than defining zero methods for additional types).
It will not always produce that. Rational is missing a type parameter and is therefore not concrete, so field3 is stored as a reference and shows up as #undef; with a concrete type such as Rational{Int} it behaves like fields 1 and 2. For bits types you’ll get arbitrary values (which just might happen to be 0 more often than other values because that’s a common state for memory to be in).
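A small sketch of the difference, using a hypothetical variant BConcrete of the struct above with the type parameter filled in. No output is shown for the instance because its contents depend on whatever was in memory.

```julia
struct BConcrete            # hypothetical variant of B
    field1::Float64
    field2::Int
    field3::Rational{Int}   # concrete, stored inline
    BConcrete() = new()     # incomplete initialization: nothing is written
end

isconcretetype(Rational)       # false: a `Rational` field is stored as a reference
isconcretetype(Rational{Int})  # true
isbitstype(BConcrete)          # true: no references, hence no #undef

x = BConcrete()
# The fields of `x` hold whatever bytes were in memory; they are not
# guaranteed to be zero, even if they often happen to be.
```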
Update: Sorry, just saw @jules comment, so this is mostly redundant.
I tried the incomplete initialization, but it most likely does a malloc in the background without writing to the memory region. Therefore, the memory’s content is arbitrary (whatever the last user of that memory region wrote). However, the C code assumes zeros at some positions. So that approach sometimes works and sometimes breaks (as we are all used to from developing C code).
I do not have (Julia) references in that struct, but all inner structs are immutable ones to mimic the C struct (some of the inner structs will contain C pointers at one point, but C is taking care of that). So without references, there is no undef.
There is a problem with zero-ing an immutable struct since it might not reside in memory. But suppose the structure is mutable, the following seems to work:
julia> @kwdef mutable struct MutBigStruct
           field1::Int=0
           field2::Int=0
           field3::Int=0
       end
MutBigStruct
julia> mbs = MutBigStruct(1,2,3)
MutBigStruct(1, 2, 3)
julia> GC.@preserve mbs begin
           p = convert(Ptr{UInt8}, pointer_from_objref(mbs))
           @inbounds for i in 1:sizeof(mbs)
               unsafe_store!(p, 0, i)
           end
       end
julia> mbs
MutBigStruct(0, 0, 0)
A few notes:
The GC.@preserve is needed to make sure the structure isn’t moved or GCed while unsafely manipulated.
There is a faster standard Libc function, memset, that does the work of the for loop, but it seems to be available only from 1.10 on (I’m on 1.9.4).
Zero memory is not always a valid state for a data structure. So it is all in the unsafe_ category of manipulation.
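For immutable isbits structs, one possible variant is to give the value an address by wrapping it in a Ref and zeroing that. This is only a sketch under assumptions: the struct Inner and the helper zeroed_ref are hypothetical names, and the direct ccall to the C memset is used here so that it does not depend on the Julia version.

```julia
# Hypothetical immutable struct mirroring a C struct.
struct Inner
    a::Cint
    b::Cdouble
end

# Allocate a GC-managed Ref{T} and overwrite its payload with zero bytes.
function zeroed_ref(::Type{T}) where {T}
    isbitstype(T) || throw(ArgumentError("$T is not an isbits type"))
    r = Ref{T}()                                  # uninitialized, but GC-tracked
    GC.@preserve r begin
        p = Base.unsafe_convert(Ptr{T}, r)        # address of the stored value
        ccall(:memset, Ptr{Cvoid}, (Ptr{Cvoid}, Cint, Csize_t), p, 0, sizeof(T))
    end
    return r
end

r = zeroed_ref(Inner)
v = r[]    # read the zeroed value back
```

The Ref can then be passed to a ccall whose argument type is declared as Ref{Inner}, so the C side writes into memory that Julia's GC still owns.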
Yes, zero memory is not always a valid state. However, in this case due to the construction according to a specification it is valid, so from a user’s point of view it’s safe (subject to a correct implementation, but that’s the case with all safe code).
I settled on the following solution for Julia 1.10:
abstract type ZeroInitializable end
function zeros!(x::ZeroInitializable)
    GC.@preserve x Base.memset(pointer_from_objref(x), 0, sizeof(x))
    return x
end
mutable struct MutBigStruct <: ZeroInitializable
    a::Int
    b::Float64
    MutBigStruct() = new() |> zeros!
end
Thanks to everybody who was involved in the discussion. If you have feedback on that solution or further thoughts, please comment.