Allocations for abstract field of struct

sstroemer · March 14, 2023, 8:39pm

Considering the following struct definition for _MyTest1:

using BenchmarkTools

abstract type _AbstractReference end

struct _TestStringReference <: _AbstractReference
    val::String
end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

struct _MyTest1
    x::_AbstractReference
end
test1(x::String) = _MyTest1(_TestStringReference(x))
test1(x::Int64) = _MyTest1(_TestIntReference(x))

constructing an instance with test1 leads to an allocation:

s = "foobar"
@btime test1($s)
#  3.352 ns (1 allocation: 16 bytes)

compared to

struct _MyTest2
    x::_TestStringReference
end
test2(x::String) = _MyTest2(_TestStringReference(x))

@btime test2($s)
#  1.503 ns (0 allocations: 0 bytes)

as well as a considerable loss in performance each time I access the field:

a = test1(s)
@btime ($a).x.val
#  7.693 ns (0 allocations: 0 bytes)

b = test2(s)
@btime ($b).x.val
#  1.502 ns (0 allocations: 0 bytes)

Is there any “convenient” way to circumvent this?

What I am trying to do (simplified example) is to have a constructor MyTest1(x::String) that creates an instance of _MyTest1 containing either a “string reference” (for arbitrary strings) or an “integer reference” (if the string can be parsed into an integer, like “123”).

Sevi · March 14, 2023, 9:08pm

Would something like this work? (haven’t tested it myself)

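struct _MyTest1{R <: _AbstractReference}
    x::R
end

# Outer constructor: store an integer reference if the string parses as an
# Int64, otherwise keep it as a string reference. `new` is only available in
# inner constructors, so this calls the default constructor instead.
function _MyTest1(x::String)
    parsed_x = tryparse(Int64, x)
    ref = parsed_x === nothing ? _TestStringReference(x) : _TestIntReference(parsed_x)
    return _MyTest1(ref)
end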

Any instance of _MyTest1 would have concretely typed fields, so there should be fewer performance problems when accessing them.

But if you want to loop over something like Vector{_MyTest1} later, the dispatch cost might come back.

EDIT: Got the syntax for parametric structs and methods mixed up.
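The point about concretely typed fields, and the caveat about Vector{_MyTest1}, can be checked with isconcretetype. This is a minimal sketch; the struct names _AbstractField and _ParametricField are hypothetical stand-ins for the two variants of _MyTest1 discussed in this thread.

```julia
abstract type _AbstractReference end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

# Hypothetical name: the original layout, with an abstractly typed field.
struct _AbstractField
    x::_AbstractReference
end

# Hypothetical name: the parametric layout suggested above.
struct _ParametricField{R <: _AbstractReference}
    x::R
end

# The abstract field type is not concrete, so the field is stored as a reference.
abstract_field_is_concrete = isconcretetype(fieldtype(_AbstractField, :x))

# Once R is fixed, the field type is concrete.
parametric_field_is_concrete =
    isconcretetype(fieldtype(_ParametricField{_TestIntReference}, :x))

# Without a parameter, _ParametricField is not a concrete type, so a
# Vector{_ParametricField} can hold a mix of element types.
vector_eltype_is_concrete = isconcretetype(eltype(Vector{_ParametricField}))
```

Here abstract_field_is_concrete and vector_eltype_is_concrete are false, while parametric_field_is_concrete is true.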

mikmoore · March 14, 2023, 9:31pm

struct _MyTest3
    x::Union{_TestStringReference, _TestIntReference}
end

Thanks to union splitting, this should avoid most of the performance cost. Of course, this only works if you have a small known-in-advance set of possible types.
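As a rough illustration of where union splitting helps, the sketch below calls a function on the Union-typed field inside a loop. The helpers payload_size and total_size are hypothetical names, not something from the thread; whether the compiler actually splits the union in a given case is best confirmed with @code_warntype or a benchmark.

```julia
abstract type _AbstractReference end

struct _TestStringReference <: _AbstractReference
    val::String
end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

struct _MyTest3
    x::Union{_TestStringReference, _TestIntReference}
end

# Hypothetical helper with one method per concrete reference type.
payload_size(r::_TestStringReference) = length(r.val)
payload_size(r::_TestIntReference) = ndigits(r.val)

# The field type is a small Union, so the compiler can branch on the two
# concrete types here rather than falling back to dynamic dispatch.
total_size(items::Vector{_MyTest3}) = sum(t -> payload_size(t.x), items)

items = [_MyTest3(_TestStringReference("foobar")), _MyTest3(_TestIntReference(123))]
result = total_size(items)  # 6 characters + 3 digits
```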

sstroemer · March 15, 2023, 12:26pm

Thanks for that link! I tried

struct _MyTest3
    x::Union{_TestIntReference, _TestStringReference}
end

test3(x::String) = _MyTest3(_TestStringReference(x))
test3(x::Int64) = _MyTest3(_TestIntReference(x))

@btime test3($s)
#  3.858 ns (1 allocation: 16 bytes)

a = test3(s)
@btime ($a).x.val
#  1.721 ns (0 allocations: 0 bytes)

which shows that access is now basically as fast as with the hardcoded approach, but the allocation and time overhead during creation are still there (I’m creating the object less often than I access it, so that could be fine).

sstroemer · March 15, 2023, 12:29pm

Sevi:

Would something like this work? (haven’t tested it myself)
struct _MyTest1{R} where {R<:_AbstractReference}
    x::R
end

function _MyTest1(x)
    parsed_x = # try parsing as Int or keep string
    R = # compute according reference type R
    return new{R}(parsed_x)
end
Any instance of _MyTest1 would have concretely typed fields, so there should be less performance problems when accessing them.

But if you want to loop over something like Vector{_MyTest1} later, the dispatch cost might come back.

I am not sure how I could really pull that off for “large” structs (that is, with many fields). For now I rely on dispatch: I pass a plain String, which is then “automatically” converted into one of those References. I need that for a lot of different fields and a lot of different “input types” that are then properly converted. So my question is: is there any “meta” way of doing your approach, without me needing to type out dozens of type templates?

lmiq · March 15, 2023, 12:43pm

Usually structs with many fields don’t have that many types; that is, many fields share the same type, so that something like struct A{T1,T2,T3} ... end tends to suffice.
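A minimal sketch of that idea, assuming two fields: the struct takes one type parameter per field, and a single outer constructor converts every argument through a small dispatch-based helper, so the type parameters never have to be written out at the call site. The names _MyTest5 and _toref are hypothetical.

```julia
abstract type _AbstractReference end

struct _TestStringReference <: _AbstractReference
    val::String
end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

# One type parameter per field keeps every field concretely typed.
struct _MyTest5{A <: _AbstractReference, B <: _AbstractReference}
    x::A
    y::B
end

# Hypothetical helper: dispatch picks the reference type for each input type.
_toref(x::String) = _TestStringReference(x)
_toref(x::Int64) = _TestIntReference(x)
_toref(x::_AbstractReference) = x

# Generic outer constructor; the converted arguments then hit the default
# constructor, which infers A and B.
_MyTest5(x, y) = _MyTest5(_toref(x), _toref(y))

t = _MyTest5("foobar", 123)
```

Here t has the concrete type _MyTest5{_TestStringReference, _TestIntReference}. The type of the result still depends on the argument types, so the caveat about mixed collections raised earlier in the thread applies here as well.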

Now, if you really want all possible flexibility without annotating anything, but still keeping fields concrete, you may just want to define a named tuple:

julia> test1(x::String) = (; :x => _TestStringReference(x))
test1 (generic function with 1 method)

julia> test1("foobar")
(x = _TestStringReference("foobar"),)

julia> @btime test1("foobar")
  2.186 ns (0 allocations: 0 bytes)
(x = _TestStringReference("foobar"),)

and eventually wrap the named tuple in a struct for dispatch, such as

julia> struct _MyTest1{T<:NamedTuple}
           x::T
       end

and define getproperty operations for the _MyTest1 type.
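One way the wrapper plus getproperty could look is sketched below. The struct name _Wrapped and its field name fields are hypothetical and chosen to avoid clashing with the x field of the named tuple; getfield is used inside the overload so that the forwarding does not call itself.

```julia
abstract type _AbstractReference end

struct _TestStringReference <: _AbstractReference
    val::String
end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

# Hypothetical wrapper: the named tuple type T records every field type.
struct _Wrapped{T <: NamedTuple}
    fields::T
end

# Forward property access to the wrapped named tuple. getfield bypasses this
# overload, which avoids infinite recursion.
Base.getproperty(w::_Wrapped, name::Symbol) = getproperty(getfield(w, :fields), name)
Base.propertynames(w::_Wrapped) = propertynames(getfield(w, :fields))

w = _Wrapped((; x = _TestStringReference("foobar"), y = _TestIntReference(123)))
string_value = w.x.val
int_value = w.y.val
```

With this in place, w.x and w.y read like ordinary struct fields, while methods can still dispatch on _Wrapped.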

sstroemer · March 15, 2023, 12:55pm

I probably have 3-5 different types, but each field can be of each type, and all combinations between fields are somewhat possible.

Thanks for the NamedTuple + getproperty idea, I’ll try that out!

DNF · March 15, 2023, 12:58pm

I think you mean

struct _MyTest1{R <: _AbstractReference}
    x::R
end

The where clause is for function signatures.
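For contrast, here is a small sketch of where in a method signature, next to the struct definition that declares its parameter directly in the braces. The function reftype is a hypothetical example.

```julia
abstract type _AbstractReference end

struct _TestIntReference <: _AbstractReference
    val::Int64
end

# Struct definition: the bound on R goes inside the braces, without `where`.
struct _MyTest1{R <: _AbstractReference}
    x::R
end

# Method signature: `where` introduces the type parameter R.
reftype(::_MyTest1{R}) where {R <: _AbstractReference} = R

found = reftype(_MyTest1(_TestIntReference(1)))
```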

Sevi · March 15, 2023, 3:02pm

Ah yes, thanks for fixing that!
