Zero-allocating Instance of Struct

I have large structs with many fields (think several hundred) that I need to exchange with a C library. Julia creates the struct instances (memory allocation), but C fills in the values. Leaving the memory undefined on the Julia side should work, but it doesn't: it turns out that the C code involved does not care about the initial memory content as long as it's all zeros ;-).

Luckily, I can auto-generate the structs in Julia from a specification file quite easily. However, I am struggling with the initialization of a struct's instance. The best way I currently know to construct an all-zero instance is to call each field's constructor (using a support macro) and to add appropriate constructors where necessary, which is some work with all the different types involved. Additionally, the constructors needed for some field types do not make much sense on their own.

Is there an easier way to construct an all-zero instance of a struct (known to Julia’s GC)?

My thoughts so far:
I could call calloc directly instead of calling a constructor, but then the memory would not be known to Julia’s garbage collector and I would need to keep track of memory management myself which I’d rather avoid.
But if Julia offers a similar call and would inform the GC, that would be nice.
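
For illustration, the calloc route mentioned above could look roughly like the following sketch. The struct `Spec` is a hypothetical stand-in for one of the generated structs; the point is that the memory is zeroed but has to be freed by hand.

```julia
# Hypothetical stand-in for one of the generated structs.
struct Spec
    a::Int
    b::Float64
end

# calloc returns zero-filled memory that Julia's GC does not track.
p = convert(Ptr{Spec}, Libc.calloc(1, sizeof(Spec)))
p == C_NULL && throw(OutOfMemoryError())

s = unsafe_load(p)   # copy the zeroed bytes into a Julia value
Libc.free(p)         # manual cleanup, which is the part to avoid
```

Here `s` is an ordinary Julia value again, but the buffer itself was never known to the GC.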

Additionally, I am thinking of something like zeros, but for a struct instead of an Array. zeros calls fill!(array, 0), which for byte arrays calls memset. I think I could implement an analogous zeros method for a struct, called from an inner constructor that uses incomplete initialization.
But maybe such functionality already exists?

You could define your big struct in the following manner, which will give you a constructor that does not involve passing values for fields:

@kwdef struct MyBigStruct
    field1::Int=0
    field2::Int=0
    field3::Int=0
end

someinstance = MyBigStruct()
# will produce: MyBigStruct(0, 0, 0)

However, a better approach might be something along the lines of the following function:

function initzeros(::Type{T}) where {T}
    fvals = fieldtypes(T)
    T(zero.(fvals)...)
end

struct BigOne
    field1::Int
    field2::Float64
    field3::Rational
    # .... lots of additional fields
    field1000::Int8
end

niceinstance = initzeros(BigOne)
# will output your struct instance with all fields `zero` (in a type-safe manner).

Thanks for that idea. It would indeed simplify the creation a bit, because I'd need to touch each field only once (and not effectively a second time for the constructor call).

However, I think the major problem would persist: either every type must support the same constructor (e.g. one accepting a 0), which is quite unnatural for some field types (think of a String or an immutable struct), or I'd need to maintain a list of which constructor to call for which type. Neither is especially elegant.

Check out the updated answer - you might find some use of that initzeros approach.

P. S. This would imply adding zero methods for String and potentially other types.
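
A minimal sketch of what that could mean in practice, assuming an empty string is an acceptable "zero" for String. The struct `WithString` is hypothetical, and note that adding a `zero` method for a Base type from user code is type piracy, so this is an illustration rather than a recommendation.

```julia
# Same helper as above, repeated so the snippet is self-contained.
initzeros(::Type{T}) where {T} = T(zero.(fieldtypes(T))...)

# Assumption: an empty string is an acceptable "zero" for String.
# Extending Base.zero for a Base type like this is type piracy.
Base.zero(::Type{String}) = ""

struct WithString   # hypothetical example struct
    id::Int
    name::String
end

w = initzeros(WithString)
```

Every field type without an existing `zero` method would need a similar one-line definition.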

Interesting, thanks, that's definitely an improvement over my current state. Instead of implementing unnatural constructors for the fields' types, I'd only need to implement a much more natural zero method for the types which lack one.

I am still hoping for a “one knob” solution which does not need to add multiple methods. Of course, I could try to implement zeros for type Any based on some recursive approach, but that’s probably hard to get right in general.

Let’s see whether there are more ideas.

You can always consider the undef approach:

struct B
    field1::Float64
    field2::Int
    field3::Rational
    field4::String
    B() = new()
end

initialized = B()
# will produce: B(0.0, 0, #undef, #undef)

You no longer need to concern yourself about defining additional zero methods for all kinds of types.

My understanding is that the undef approach is compatible with what you wrote in your original post (and it turns out that on Julia's side it is easier to go with undef than to define zero methods for additional types).

It will not always produce that. Rational is missing a type parameter, so it is not concrete; once it is concrete (e.g. Rational{Int}), it will behave like fields 1 and 2. For bits types you'll get random values (which just might happen to be 0 more often than other values because that's a common state for memory to be in):

julia> Vector{Rational{Int}}(undef, 1000)
1000-element Vector{Rational{Int64}}:
      -4096//0
      -4096//34359738368
 4508212816//105553164661728
           ⋮
          0//0
          0//1065353216

julia> Vector{Float64}(undef, 1000)
1000-element Vector{Float64}:
 2.121995791e-314
 0.0
 0.0
 ⋮
 2.5708623274e-314
 2.5708615527e-314

julia> Vector{Int64}(undef, 1000)
1000-element Vector{Int64}:
               0
               0
               0
               ⋮
 105553156116416
      5747432632

Update: Sorry, I just saw @jules's comment, so this is mostly redundant.

I tried the incomplete initialization, but it most likely does a malloc in the background without writing to the memory region. Therefore, the memory's content is random (whatever the last user of that memory region wrote there). However, the C code assumes zeros at some positions. So that approach sometimes works and sometimes breaks (as we are all used to from developing C code).

I do not have (Julia) references in that struct, but all inner structs are immutable ones to mimic the C struct (some of the inner structs will contain C pointers at one point, but C is taking care of that). So without references, there is no undef.
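
To make the "no references, no undef" point concrete, here is a small sketch with hypothetical structs `Inner` and `Outer`. It only checks that every field is a bits type; it says nothing about what the uninitialized bytes contain.

```julia
struct Inner          # hypothetical immutable inner struct
    x::Int32
    p::Ptr{Cvoid}     # C pointers are plain bits as well
end

mutable struct Outer  # hypothetical outer struct handed to C
    a::Float64
    inner::Inner
    Outer() = new()   # incomplete initialization: contents are arbitrary
end

# true when no field is a Julia reference, so no field can be #undef
all_bits = all(isbitstype, fieldtypes(Outer))

# a String field, by contrast, is a reference and could be #undef
string_is_bits = isbitstype(String)
```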

This clarifies it - so I think we are at the square zero:

Let’s see if anybody can propose a general solution that will not involve defining myriads of zero methods.

There is a problem with zeroing an immutable struct, since it might not reside in memory. But if the structure is mutable, the following seems to work:

julia> @kwdef mutable struct MutBigStruct
           field1::Int=0
           field2::Int=0
           field3::Int=0
       end
MutBigStruct

julia> mbs = MutBigStruct(1,2,3)
MutBigStruct(1, 2, 3)

julia> GC.@preserve mbs begin
           p = convert(Ptr{UInt8}, pointer_from_objref(mbs))
           @inbounds for i in 1:sizeof(mbs)
               unsafe_store!(p, 0, i)
           end
       end

julia> mbs
MutBigStruct(0, 0, 0)

A few notes:

  1. The GC.@preserve is needed to make sure the structure isn’t moved or GCed while unsafely manipulated.
  2. There is a faster standard Libc function, memset, to do the work of the for loop, but it seems to be available only from 1.10 on (I'm on 1.9.4).
  3. Zero memory is not always a valid state for a data structure. So it is all in the unsafe_ category of manipulation.
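
On versions before 1.10, one possible way to replace the loop is to call C's memset directly through ccall. This is a sketch under that assumption; `Counter` and `zero_bytes!` are hypothetical names, and the same unsafe_ caveats from the notes above apply.

```julia
mutable struct Counter   # hypothetical example struct
    a::Int
    b::Float64
end

function zero_bytes!(x)
    GC.@preserve x begin
        # C's memset(void *dst, int value, size_t n), called directly
        ccall(:memset, Ptr{Cvoid}, (Ptr{Cvoid}, Cint, Csize_t),
              pointer_from_objref(x), 0, sizeof(x))
    end
    return x
end

c = zero_bytes!(Counter(1, 2.0))
```

pointer_from_objref only works for mutable objects, so this helper errors on immutable structs instead of silently doing the wrong thing.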

Thanks, Dan, you are right, I am talking about a mutable struct here, sorry for not mentioning it.

Your solution is very similar to the 1.10 code of the Array fill! method, just without using memset.

Regarding your third note: yes, zero memory is not always a valid state. However, in this case it is valid because the structs are constructed according to a specification, so from a user's point of view it's safe (subject to a correct implementation, but that's the case with all safe code).

I settled on the following solution for Julia 1.10:

abstract type ZeroInitializable end
function zeros!(x::ZeroInitializable)
    GC.@preserve x Base.memset(pointer_from_objref(x), 0, sizeof(x))
    return x
end

mutable struct MutBigStruct <: ZeroInitializable
    a::Int
    b::Float64
    MutBigStruct() = new() |> zeros!
end
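
As a usage sketch, the same pattern should also cover immutable inner structs stored inline, which is the layout described earlier in the thread. `Header` and `Message` are hypothetical names; the helper is repeated so the snippet runs on its own (Julia 1.10 or later, because of Base.memset).

```julia
abstract type ZeroInitializable end

function zeros!(x::ZeroInitializable)
    GC.@preserve x Base.memset(pointer_from_objref(x), 0, sizeof(x))
    return x
end

struct Header            # hypothetical immutable inner struct
    version::Int32
    flags::UInt32
end

mutable struct Message <: ZeroInitializable   # hypothetical outer struct
    header::Header
    payload::NTuple{4, Float64}
    Message() = new() |> zeros!
end

m = Message()
```

This relies on all fields being bits types; a field holding a Julia reference would end up as a null pointer after the memset.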

Thanks to everybody who was involved in the discussion. If you have feedback on that solution or further thoughts, please comment.
