Allocations for abstract field of struct

Consider the following struct definition for _MyTest1:

using BenchmarkTools

abstract type _AbstractReference end

struct _TestStringReference <: _AbstractReference
    val::String
end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

struct _MyTest1
    x::_AbstractReference
end
test1(x::String) = _MyTest1(_TestStringReference(x))
test1(x::Int64) = _MyTest1(_TestIntReference(x))

Constructing an instance through test1 leads to an allocation:

s = "foobar"
@btime test1($s)
#  3.352 ns (1 allocation: 16 bytes)

compared to

struct _MyTest2
    x::_TestStringReference
end
test2(x::String) = _MyTest2(_TestStringReference(x))

@btime test2($s)
#  1.503 ns (0 allocations: 0 bytes)

There is also a considerable loss in performance each time I access the field:

a = test1(s)
@btime ($a).x.val
#  7.693 ns (0 allocations: 0 bytes)

b = test2(s)
@btime ($b).x.val
#  1.502 ns (0 allocations: 0 bytes)

Is there any “convenient” way to circumvent this?

What I am trying to do (in a simplified example) is to have a constructor MyTest1(x::String) that creates an instance of _MyTest1 containing either a “string reference” (for arbitrary strings) or an “integer reference” (if the string can be parsed into an integer, like “123”).

Would something like this work? (haven’t tested it myself)

struct _MyTest1{R <: _AbstractReference}
    x::R
end

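function _MyTest1(x::String)
    # try parsing as Int; `tryparse` returns `nothing` if that fails
    parsed_x = tryparse(Int64, x)
    # pick the reference type accordingly
    ref = parsed_x === nothing ? _TestStringReference(x) : _TestIntReference(parsed_x)
    # `new` only exists in inner constructors; the default constructor infers R
    return _MyTest1(ref)
end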

Any instance of _MyTest1 would have concretely typed fields, so there should be fewer performance problems when accessing them.

But if you want to loop over something like Vector{_MyTest1} later, the dispatch cost might come back.
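To illustrate that caveat, here is a minimal sketch. The names `_Ref`, `_StrRef`, `_IntRef` and `_Wrapped` are hypothetical stand-ins for the types in this thread, chosen so the snippet runs on its own; it only checks whether the element type of a vector is concrete, it does not measure anything.

```julia
abstract type _Ref end
struct _StrRef <: _Ref
    val::String
end
struct _IntRef <: _Ref
    val::Int64
end

struct _Wrapped{R <: _Ref}
    x::R
end

# A single instance has a concrete type, so field access is type-stable.
w = _Wrapped(_IntRef(1))
isconcretetype(typeof(w))        # true

# Mixing both kinds in one vector forces a non-concrete element type,
# so code looping over it has to dispatch at run time again.
mixed = [_Wrapped(_IntRef(1)), _Wrapped(_StrRef("a"))]
isconcretetype(eltype(mixed))    # false

# A vector holding only one kind keeps a concrete element type.
ints = [_Wrapped(_IntRef(1)), _Wrapped(_IntRef(2))]
isconcretetype(eltype(ints))     # true
```

So the parametric version helps most when each container holds a single kind of reference.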

EDIT: Got the syntax for parametric structs and methods mixed up.

struct _MyTest3
    x::Union{_TestStringReference, _TestIntReference}
end

Thanks to union splitting, this should avoid most of the performance cost. Of course, this only works if you have a small known-in-advance set of possible types.
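A rough sketch of how such a field might be used, with hypothetical names (`_StrRef`, `_IntRef`, `_Small`, `describe`, `total`) so that it is self-contained. Because the field type is a small Union of concrete types, the compiler can in principle turn the call on `t.x` into a branch per type instead of a dynamic dispatch; whether it actually does so in a given case is best checked with `@code_warntype` or a benchmark.

```julia
struct _StrRef
    val::String
end
struct _IntRef
    val::Int64
end

struct _Small
    x::Union{_StrRef, _IntRef}
end

# One method per member of the Union; both return an Int.
describe(r::_StrRef) = length(r.val)
describe(r::_IntRef) = r.val

# The element type of `ts` is the concrete type `_Small`,
# and `t.x` is known to be one of two concrete types.
total(ts) = sum(t -> describe(t.x), ts)

ts = [_Small(_IntRef(3)), _Small(_StrRef("abcd"))]
total(ts)   # 7
```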

Thanks for that link! I tried

struct _MyTest3
    x::Union{_TestIntReference, _TestStringReference}
end

test3(x::String) = _MyTest3(_TestStringReference(x))
test3(x::Int64) = _MyTest3(_TestIntReference(x))

@btime test3($s)
#  3.858 ns (1 allocation: 16 bytes)

a = test3(s)
@btime ($a).x.val
#  1.721 ns (0 allocations: 0 bytes)

This shows that field access is now basically as fast as with the hardcoded approach, but the allocation and time overhead during construction are still there (I create the object less often than I access it, so that could be fine).

I am not sure how I could really pull that off for “large” structs (that is, structs with many fields). For now I rely on dispatch to figure out that I’m passing a plain String, which is then “automatically” converted into one of those references. I need that for a lot of different fields and a lot of different “input types” that are then properly converted. So my question is: is there any “meta” way of doing your approach without me needing to type out dozens of type parameters?

Usually structs with many fields don’t have that many distinct types; many fields share the same type, so something like struct A{T1,T2,T3} ... end tends to suffice.
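As a sketch of what that could look like with conversion by dispatch: the helper `_to_ref`, the struct `_Big` and the reference types below are hypothetical names, not something from your code. The idea is to write the conversion once per input type and let the default constructor infer the type parameters.

```julia
abstract type _Ref end
struct _StrRef <: _Ref
    val::String
end
struct _IntRef <: _Ref
    val::Int64
end

# Hypothetical helper: one method per accepted input type.
_to_ref(x::_Ref) = x
_to_ref(x::Integer) = _IntRef(x)
_to_ref(x::String) = _StrRef(x)

struct _Big{A <: _Ref, B <: _Ref, C <: _Ref}
    a::A
    b::B
    c::C
end

# Outer constructor: convert every argument, then let the default
# constructor infer A, B and C from the converted values.
_Big(a, b, c) = _Big(_to_ref(a), _to_ref(b), _to_ref(c))

big = _Big("foo", 1, "2")
typeof(big)   # _Big{_StrRef, _IntRef, _StrRef}
```

Note that the conversion here depends only on the input type, so "2" stays a string reference. Deciding by the string's content (parsing) happens at run time, and then the resulting type can no longer be inferred at the call site.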

Now, if you really want all possible flexibility without annotating anything, but still keeping fields concrete, you may just want to define a named tuple:

julia> test1(x::String) = (; :x => _TestStringReference(x))
test1 (generic function with 1 method)

julia> test1("foobar")
(x = _TestStringReference("foobar"),)

julia> @btime test1("foobar")
  2.186 ns (0 allocations: 0 bytes)
(x = _TestStringReference("foobar"),)

and, if needed, wrap the named tuple in a struct for dispatch, such as

julia> struct _MyTest1{T<:NamedTuple}
           x::T
       end

and define getproperty operations for the _MyTest1 type.
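A minimal sketch of that wrapper, assuming a hypothetical struct name `_Record` with a single field `fields` (renamed from `x` here to avoid confusion with the keys of the named tuple), and a keyword constructor added for convenience:

```julia
struct _StrRef
    val::String
end

struct _Record{T <: NamedTuple}
    fields::T
end

# Keyword constructor: collects the keyword arguments into a NamedTuple.
_Record(; kwargs...) = _Record((; kwargs...))

# Forward `r.name` to the wrapped named tuple.
# `getfield` is used to reach the real field without recursing into getproperty.
Base.getproperty(r::_Record, name::Symbol) = getproperty(getfield(r, :fields), name)
Base.propertynames(r::_Record) = propertynames(getfield(r, :fields))

r = _Record(x = _StrRef("foobar"), n = 1)
r.x.val            # "foobar"
r.n                # 1
propertynames(r)   # (:x, :n)
```

Since the named tuple type is a parameter of `_Record`, every instance is concretely typed, and methods can still dispatch on `_Record`.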

I probably have 3-5 different types, but each field can be of each type, and all combinations between fields are somewhat possible.

Thanks for the NamedTuple + getproperty idea, I’ll try that out!

I think you mean

struct _MyTest1{R <: _AbstractReference}
    x::R
end

The where clause is for function signatures.

Ah yes, thanks for fixing that!