Type stability when looping over fields in a heterogeneous data struct

I have a number of structs that contain arrays of varying dimensionality. I have a function that takes these structs and loops over their fields, performing some operation on each. Suppose, for simplicity, that I just want to calculate the total length of all these arrays. I can accomplish this using the following:

function test(xyz)
    total_length = 0
    fields = map(f -> getproperty(xyz, f), propertynames(xyz))
    @inbounds for field in fields
        total_length += length(field)
    end
    return total_length
end

If I have a struct such as

struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

with, for example, y = y_t(zeros(3), zeros(10, 2), zeros(2,5,10)), then the function test called with type y_t is not type stable as field is either a vector, matrix, or multi-dimensional array. How can I achieve type stability and avoid any allocations? I wish to maintain heterogeneous data in my structs, if possible. Thanks.

In this case, the compiler was able to determine that field only takes on the 3 types in the tuple returned by map, so the returned total_length was inferred as Int64. I wouldn’t worry about this method; it doesn’t even allocate.
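One way to check both claims yourself is sketched below. It restates the struct and function from the question; count_allocs is a hypothetical helper name, introduced here so that @allocated measures a call made inside a function rather than from global scope. The printed values may differ between Julia versions, so none are shown.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

function test(xyz)
    total_length = 0
    fields = map(f -> getproperty(xyz, f), propertynames(xyz))
    @inbounds for field in fields
        total_length += length(field)
    end
    return total_length
end

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))

# Return type(s) the compiler infers for test(::y_t)
println(Base.return_types(test, (y_t,)))

# Hypothetical helper: measure allocations from inside a function
count_allocs(s) = @allocated test(s)
count_allocs(y)           # first call triggers compilation
println(count_allocs(y))  # bytes allocated by the second call
```

If the return type prints as a single concrete type and the allocation count is zero, the local instability of field is not costing anything in practice.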

But if you really don’t like that red, type-unstable variable in the @code_warntype output when iterating through a heterogeneous tuple, you can do tuple-to-tuple computations, like test2(xyz) = sum(ntuple(i -> length(getfield(xyz, i)), fieldcount(typeof(xyz)))). That works for simpler cases; more generally I see inlined recursive tuple constructions, like map(f, t::Tuple) = (@inline; (f(t[1]), map(f, Base.tail(t))...)), together with a method for the empty tuple that ends the recursion.
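A minimal sketch of both ideas, assuming the y_t struct from the question. The names tuplemap, fields_of, and test3 are hypothetical and chosen to avoid overwriting Base.map; the empty-tuple method is the base case that the one-line recursive definition needs.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Tuple-to-tuple version: build a tuple of lengths, then sum it
test2(xyz) = sum(ntuple(i -> length(getfield(xyz, i)), fieldcount(typeof(xyz))))

# Hypothetical recursive map over a tuple.
# Base case: nothing left to map over.
tuplemap(f, ::Tuple{}) = ()
# Recursive case: apply f to the head, recurse on the tail, splat the result.
@inline tuplemap(f, t::Tuple) = (f(first(t)), tuplemap(f, Base.tail(t))...)

# Hypothetical helper: collect the fields of a struct into a tuple
fields_of(xyz) = ntuple(i -> getfield(xyz, i), fieldcount(typeof(xyz)))

test3(xyz) = sum(tuplemap(length, fields_of(xyz)))

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))
println(test2(y))  # 3 + 20 + 100 = 123
println(test3(y))  # same total via the recursive map
```

Each recursive call sees a tuple type that is one element shorter, so every element is handled with its own concrete type instead of through a loop variable whose type changes.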


Another way to get clean @code_warntype output is to use ConstructionBase:

using ConstructionBase
function test2(xyz)
    sum(length, getproperties(xyz))
end

@code_warntype test2(xyz)
# MethodInstance for test2(::y_t)
#   from test2(xyz) in Main at /home/jan/delme/doit.jl:9
# Arguments
#   #self#::Core.Const(test2)
#   xyz::y_t
# Body::Int64
# 1 ─ %1 = Main.getproperties(xyz)::NamedTuple{(:a, :b, :c), Tuple{Vector{Float64}, Matrix{Float64}, Array{Float64, 3}}}
# │   %2 = Main.sum(Main.length, %1)::Int64
# └──      return %2

You can use getproperties and setproperties to convert between “struct” land and NamedTuple land, and Base typically has good implementations of all kinds of NamedTuple manipulations.
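To give a flavour of the NamedTuple manipulations that Base offers, here is a rough sketch that runs without the package. to_namedtuple is a hypothetical hand-rolled stand-in for getproperties, written only so the example is self-contained; it is not how ConstructionBase implements it.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Hypothetical stand-in for ConstructionBase.getproperties
to_namedtuple(x) = NamedTuple{fieldnames(typeof(x))}(
    ntuple(i -> getfield(x, i), fieldcount(typeof(x))))

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))
nt = to_namedtuple(y)

# map over a NamedTuple keeps the field names
lens = map(length, nt)
println(lens)             # (a = 3, b = 20, c = 100)

# Iterating a NamedTuple yields its values, so reductions work directly
println(sum(length, nt))  # 123

# merge replaces one field and leaves the others untouched
nt2 = merge(nt, (a = zeros(5),))
println(length(nt2.a))    # 5
```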


Why not eliminate your problem with something like this:

struct y_t{n, T <: NTuple{n, AbstractArray}}
  arrays::T
end

Val(fieldcount(typeof(xyz))) could be better than fieldcount(typeof(xyz)), if the field count is greater than ten, at least.
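As a sketch of that suggestion, the only change is wrapping the count in Val so the tuple length is part of the type that ntuple dispatches on. test_val is a hypothetical name, and the struct is the y_t from the question; with only three fields this example shows the syntax rather than any speed difference.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Hypothetical variant: pass the field count as Val(N) instead of an Int
test_val(xyz) = sum(ntuple(i -> length(getfield(xyz, i)),
                           Val(fieldcount(typeof(xyz)))))

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))
println(test_val(y))  # 3 + 20 + 100 = 123
```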


This structure will be far less performant, no? Because it has AbstractArrays inside.

NTuple{3, AbstractArray} is an abstract type like Tuple{T,U,V} where {T<:AbstractArray, U<:AbstractArray, V<:AbstractArray}, so T would just be typeof(arrays).

julia> y = y_t((zeros(3), zeros(10, 2), zeros(2,5,10))); typeof(y)
y_t{3, Tuple{Vector{Float64}, Matrix{Float64}, Array{Float64, 3}}}

In T <: NTuple{n, AbstractArray}, the right-hand side is merely a constraint on the left-hand side, saying that T must subtype NTuple{n, AbstractArray}. Note that, unlike other Julia types, tuples are covariant, thus:

julia> Tuple{Vector{Float64}, Matrix{Float64}} <: NTuple{2, AbstractArray}
true
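The contrast with ordinary (invariant) parametric types can be sketched as follows. wrapped_t is a hypothetical name for the same definition as the y_t proposed above, renamed so it does not clash with the y_t from the question.

```julia
# Ordinary parametric types are invariant in their parameters
println(Vector{Float64} <: Vector{Real})  # false

# Tuple types are covariant
println(Tuple{Float64} <: Tuple{Real})    # true

# Hypothetical name for the wrapper struct proposed above
struct wrapped_t{n, T <: NTuple{n, AbstractArray}}
    arrays::T
end

w = wrapped_t((zeros(3), zeros(10, 2), zeros(2, 5, 10)))

# T is bound to the concrete tuple type, so the field is concretely typed
println(fieldtype(typeof(w), :arrays))
println(isconcretetype(fieldtype(typeof(w), :arrays)))  # true

println(sum(length, w.arrays))  # 3 + 20 + 100 = 123
```

Because the field type is concrete, the AbstractArray in the constraint does not by itself force dynamic dispatch on accesses to arrays.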

Yes, I forgot about the covariant bit, I am sorry. Were it not for this exception, my comment would have made sense.
