How to efficiently create a struct to store parametric structs in tuples

Hi!

The general rule of thumb when defining parametric structs is that every field must have strictly defined type when all parameters are set. Let’s take a look at your example:

struct Offset{M}
    offsets::NTuple{M, Int}
end

struct SymmetricOffsets{N}
    offsets::NTuple{N,Offset}
end

The Offset types follows this rule - the offsets field of Offset{2} will be always NTuple{2,Int}, or, in other words, Tuple{Int,Int}. However, with SymmetricOffset{2} the type of offsets is not fully defined: it can be Tuple{Offset{1}, Offset{2}}, or Tuple{Offset{3}, Offset{4}} - you get the point.

To avoid this, you can make the type of offsets a parameter - like this:

struct SymmetricOffsets{TupleT}
    offsets::TupleT
end

This may look a bit redundant, but it makes the compiler’s work a lot easier. To enforce some type-checks for convenience, define an inner constructor:

struct SymmetricOffsets{TupleT}
    offsets::TupleT
    offsets(offsets::T) where {T<:Tuple{Vararg{Offset}}} = new{T}(offsets)
end

This will make the performance of SymmetricOffsets and SymmetricOffsets2 identical.

To make it as fast as SymmetricOffsets3, you need to avoid iterating over a tuple with undefined eltype (e. g. in your outer loop). This can be done by replacing it with recursion. Since we never iterate over the tuple explicitly, there will be no type instability.

const SymmetricOffsetsN{N} = SymmetricOffsets{<:Tuple{Vararg{Any,N}}}
@inline function apply_offsets!(res, val, idx, dim, offsets)
    isempty(offsets) && return
    offset = first(offsets)
    for oidx in offset.offsets
        tmp = val[dim] + oidx
        res[idx] = Base.setindex(res[idx],tmp,dim)
        idx += 1
    end
    apply_offsets!(res, val, idx, dim+1, Base.tail(offsets))
end
function apply_offsets(val::NTuple{N,Int},offsets::SymmetricOffsetsN{N}) where N
    res = fill(val,length(offsets))
    apply_offsets!(res, val, 1, 1, offsets.offsets)
    res
end

These are the benchmarks I got:

  37.462 ns (1 allocation: 368 bytes)
  132.453 ns (7 allocations: 832 bytes)
  38.066 ns (1 allocation: 368 bytes)

Also note that if all Offsets were of equal lengths, we would not have needed all this trickery

1 Like