Type stability when looping over fields in a heterogenous data struct

james3 · July 15, 2023, 2:17am

I have a number of structs that contain arrays of varying dimensions. I have a function that takes these structs and loops over the fields, performing some operation. Suppose, for simplicity, that I was just calculating the total length of all these arrays. I can accomplish this using the following

function test(xyz)
    total_length = 0
    fields = map(f -> getproperty(xyz, f), propertynames(xyz))
    @inbounds for field in fields
        total_length += length(field)
    end
    return total_length
end

If I have a struct such as

struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

with, for example, y = y_t(zeros(3), zeros(10, 2), zeros(2,5,10)), then the function test called with type y_t is not type stable as field is either a vector, matrix, or multi-dimensional array. How can I achieve type stability and avoid any allocations? I wish to maintain heterogeneous data in my structs, if possible. Thanks.

Benny · July 15, 2023, 4:22am

In this case, the compiler was able to determine that field only took on the 3 types in the tuple resulting from the map, so the returned total_length was inferred as Int64. I wouldn’t worry about the method; it doesn’t even allocate.
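For anyone who wants to check the inference claim on their own Julia version, here is a minimal sketch. It reuses the struct and function from the question and asks the compiler for the inferred return type with Base.return_types. The result is left unasserted on purpose, since it may differ between Julia versions; the thread only reports Int64 for the setup discussed here.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Function from the question, unchanged
function test(xyz)
    total_length = 0
    fields = map(f -> getproperty(xyz, f), propertynames(xyz))
    @inbounds for field in fields
        total_length += length(field)
    end
    return total_length
end

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))

# One entry per matching method; a concrete type here means the return value is inferred
inferred = Base.return_types(test, (y_t,))
println("inferred return type(s): ", inferred)

total = test(y)   # 3 + 10*2 + 2*5*10
println("total length: ", total)
```

Running @code_warntype test(y) in the REPL gives the same information in more detail, including the type of the loop variable.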

But if you really don’t like that @code_warntype-red type-unstable variable iterating through a heterogeneous tuple, you can do tuple-to-tuple computations, like test2(xyz) = sum(ntuple(i -> length(getfield(xyz, i)), fieldcount(typeof(xyz)))). That’s for simpler cases. More generally, I see inlined recursive tuple constructions, like map(f, t::Tuple) = (@inline; (f(t[1]), map(f, Base.tail(t))...)).
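A runnable sketch of both ideas, under a few assumptions: tuplemap and test3 are hypothetical names (chosen so that Base.map is not overwritten), the empty-tuple method is added so the recursion terminates, and @inline is written in front of the definition rather than inside the body, which should behave the same way.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Tuple-to-tuple version: build a tuple of lengths, then sum it
test2(xyz) = sum(ntuple(i -> length(getfield(xyz, i)), fieldcount(typeof(xyz))))

# Hypothetical recursive map over a tuple: apply f to the head, recurse on the tail
tuplemap(f, ::Tuple{}) = ()
@inline tuplemap(f, t::Tuple) = (f(t[1]), tuplemap(f, Base.tail(t))...)

# Same computation, using the recursive map on the tuple of field values
function test3(xyz)
    fields = ntuple(i -> getfield(xyz, i), fieldcount(typeof(xyz)))
    return sum(tuplemap(length, fields))
end

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))
r2 = test2(y)
r3 = test3(y)
println((r2, r3))
```

Each recursive call peels one element off the tuple, so every call site sees a concrete tuple type and no variable ever has to hold "one of three array types".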

jw3126 · July 15, 2023, 6:46am

Another way to get clean @code_warntype output is to use ConstructionBase:

using ConstructionBase
function test2(xyz)
    sum(length, getproperties(xyz))
end

@code_warntype test2(y)  # y is the y_t instance from the question

# MethodInstance for test2(::y_t)
#   from test2(xyz) in Main at /home/jan/delme/doit.jl:9
# Arguments
#   #self#::Core.Const(test2)
#   xyz::y_t
# Body::Int64
# 1 ─ %1 = Main.getproperties(xyz)::NamedTuple{(:a, :b, :c), Tuple{Vector{Float64}, Matrix{Float64}, Array{Float64, 3}}}
# │   %2 = Main.sum(Main.length, %1)::Int64
# └──      return %2

You can use getproperties and setproperties to convert between “struct” land and NamedTuple land. Base typically has good implementations for all kinds of manipulations of NamedTuples.
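To illustrate the round trip without requiring the package, here is a Base-only sketch. The helper as_namedtuple is hypothetical and only imitates the struct-to-NamedTuple direction for plain structs; it is not how ConstructionBase implements getproperties.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Hypothetical stand-in: collect the fields of a plain struct into a NamedTuple
function as_namedtuple(x)
    T = typeof(x)
    vals = ntuple(i -> getfield(x, i), fieldcount(T))
    return NamedTuple{fieldnames(T)}(vals)
end

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))
nt = as_namedtuple(y)

# Base operations on NamedTuples: map keeps the field names, sum iterates the values
lengths = map(length, nt)
total = sum(length, nt)

# Back to "struct" land by splatting the values into the constructor
y2 = y_t(values(nt)...)

println(lengths)
println(total)
```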

nsajko · July 15, 2023, 4:06pm

Why not eliminate your problem with something like this:

struct y_t{n, T <: NTuple{n, AbstractArray}}
  arrays::T
end
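A short usage sketch for this layout, assuming the goal is still the total length from the question (total_length is a hypothetical name). Since the arrays already live in a tuple, no field reflection is needed:

```julia
struct y_t{n, T <: NTuple{n, AbstractArray}}
    arrays::T
end

# The tuple type is a type parameter, so the element types are known to the compiler
total_length(y::y_t) = sum(length, y.arrays)

y = y_t((zeros(3), zeros(10, 2), zeros(2, 5, 10)))
total = total_length(y)
println(typeof(y))
println(total)
```

The trade-off is that the arrays are accessed by position (y.arrays[2]) instead of by name.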

nsajko · July 15, 2023, 4:08pm

In the ntuple call, passing Val(fieldcount(typeof(xyz))) could be better than passing fieldcount(typeof(xyz)), at least if the field count is greater than ten.
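As a sketch, the Val variant of the earlier one-liner would look like this (test_val is a hypothetical name). With Val, the tuple length is part of the argument's type, so ntuple can pick a method specialized on it. The three-field struct from the question is reused here only to keep the example short; it is too small to show the difference.

```julia
struct y_t
    a::Array{Float64, 1}
    b::Array{Float64, 2}
    c::Array{Float64, 3}
end

# Same computation as test2, with the field count passed as Val{N}
test_val(xyz) = sum(ntuple(i -> length(getfield(xyz, i)), Val(fieldcount(typeof(xyz)))))

y = y_t(zeros(3), zeros(10, 2), zeros(2, 5, 10))
total = test_val(y)
println(total)
```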

Henrique_Becker · July 15, 2023, 7:40pm

This structure will be far less performant, no? Because it has AbstractArrays inside.

Benny · July 15, 2023, 10:31pm

NTuple{3, AbstractArray} is an abstract type like Tuple{T,U,V} where {T<:AbstractArray, U<:AbstractArray, V<:AbstractArray}, so T would just be typeof(arrays).

julia> y = y_t((zeros(3), zeros(10, 2), zeros(2,5,10))); typeof(y)
y_t{3, Tuple{Vector{Float64}, Matrix{Float64}, Array{Float64, 3}}}

nsajko · July 15, 2023, 11:51pm

In T <: NTuple{n, AbstractArray}, the right-hand side is merely a constraint on the left-hand side, saying that T must subtype NTuple{n, AbstractArray}. Note that, unlike other Julia types, tuples are covariant, thus:

julia> Tuple{Vector{Float64}, Matrix{Float64}} <: NTuple{2, AbstractArray}
true
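A few more checks along the same lines, as a sketch to contrast tuples with an ordinary parametric type such as Vector, and to show that the tuple type actually stored in the struct is concrete:

```julia
# Tuples are covariant in their parameters
tuple_covariant = Tuple{Vector{Float64}, Matrix{Float64}} <: NTuple{2, AbstractArray}

# Ordinary parametric types are invariant ...
vector_invariant = Vector{Vector{Float64}} <: Vector{AbstractArray}

# ... unless the parameter is explicitly left open with <:
vector_open = Vector{Vector{Float64}} <: Vector{<:AbstractArray}

# The constraint is not concrete, but the type that T gets bound to is
constraint_concrete = isconcretetype(NTuple{2, AbstractArray})
stored_concrete = isconcretetype(Tuple{Vector{Float64}, Matrix{Float64}})

println((tuple_covariant, vector_invariant, vector_open))
println((constraint_concrete, stored_concrete))
```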

Henrique_Becker · July 17, 2023, 10:44pm

Yes, I forgot about the covariant bit, I am sorry. Were it not for this exception, my comment would have made sense.
