Dramatic performance change by adding additional unused method

This is a followup of not-understanding-linear-growing-allocations. It took me a while to get a manageable reproducer after my first attempt of forgetting a simple const in a type alias. The code is as follows:

using BenchmarkTools
using StaticArrays

const Vector3 = SVector{3,Float64}

abstract type AbstractShape end

struct Box <: AbstractShape
    x::Float64; y::Float64; z::Float64
end
function extent(b::Box)
    (Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end

struct SBox <: AbstractShape
    x::Float64; y::Float64; z::Float64
end
function extent(b::SBox)
    (Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end

struct Circle <: AbstractShape
    r::Float64
end
function extent(c::Circle)
    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end

struct SCircle <: AbstractShape
    r::Float64
end
function extent(c::SCircle)
    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end

struct Figure
    label::String
    shape::AbstractShape
end

figure = Figure("1", Box(1,1,1))

function area(fig::Figure)
    lower, upper =  extent(fig.shape)
    sum = 0.
    for i in 1:1000
        sum += (upper[1]-lower[1]) * (upper[2]-lower[2])
    end
    sum
end

Running @time area(figure) you get 8k allocations. With @code_warntype you can see that the problem is with Main.extent(%1)::Any returning Any instead of returning Tuple{Vector3,Vector3}. @trace from Traceur.jl didn’t help me.

To solve the problem you just need to remove (or comment) one of the overloads of extent. For example:

struct SCircle <: AbstractShape
    r::Float64
end
#function extent(c::SCircle)
#    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
#end

You then get

julia> @time area(figure)
  0.000002 seconds
4000.0

The question is, why is this? If this a bug? is this a stupid thing that I have done?
Thanks very much for your help.

Try using

Having abstract fields is always a performance problem.

(I know that even if this solves the problem it does nor answer your doubt exactly. It may have to do with the fact that with the abstract type the function area cannot specialize)

This is not possible. I need polymorphism. Perhaps I could do it with Any. But in any case, why it works for 3 overloads and not for 4.

You could annotate the output here (as shown).

(Though I’m not sure if the previous suggestion is incompatible with what you need).

There is a limit up to which the compiler does union splitting automatically, and above which it will just fall back to dynamic dispatch.

5 Likes

Could you elaborate a bit more on why you need this? It isn’t clear from your example since that would work perfectly fine with what @leandromartinez98 suggested. Are you changing the shape of a figure dynamically (i.e. assign figure.shape to a different shape)? Or is it just that you don’t know upfront which shape you will need?

1 Like
struct Figure{S} where S <: AbstractShape
    label::String
    shape::S
end

This is a syntax error for me. Perhaps you meant

struct Figure{S<:AbstractShape}
    label::String
    shape::S
end

?

But this does give you polymorphism. You can’t mutate it at runtime, but you cannot do that anyway with struct, you would need mutable struct for that.

(Edit: Defining the following method gives you a convenient constructor:

Figure(label, shape::T) where {T} = Figure{T}(label, shape)

)

4 Likes

these could have been:

struct Box{T} <: AbstractShape
    x::T; y::T; z::T
end

and then this really should be:

struct Figure{T<:AbstractShape} where 
    label::String
    shape::T
end

if you don’t parameterize it, you’re forcing it to be (possibly) slow by forcing abstract type boxing

2 Likes

This is a surprise to me. I do not mind if there is dynamic dispatch, I was actually expecting it (equivalent in C++ to virtual call) but what I was not expecting that it is allocating memory each time is dispatching.

I working a prototype geometry modeler (see GitHub - peremato/Geom4hep). I’ll be dealing with collections of nested structures of the equivalent Figure in the reproducer. Figure has to be generic but it has a Shape that is concretized with the different types of shapes. I am not changing the shape dynamically, Once I create a Figure is inmutable. Is this helping to describe the problem?

Indeed I am using struct Box{T} <: AbstractShape. I just simplified in the reproducer.

In the actual code, equivalent to Figure is already doubled parametrized: by the floating point type, and by the nested structure because of impossibility to have forward declarations. The actual code looks like this:

#---Volume-------------------------------------------------------------------------
struct Mother{T,PV}
    label::String
    shape::AbstractShape{T}             # Reference to the actual shape
    material::Material                  # Reference to material
    daughters::Vector{PV} 
end

#---PlacedVolume-------------------------------------------------------------------
struct PlacedVolume{T<:AbstractFloat}
    transformation::Transformation3D{T}
    volume::Mother{T,PlacedVolume{T}}
end
const Volume{T} = Mother{T,PlacedVolume{T}} where T<:AbstractFloat

Note that the Material in rally should also be abstract since we have different ways to define materials. So, with your suggestion I would have proliferation of new types: Volume{T, Shape, Material}

I think that is what it is, effectively, if you follow that pattern. You could also parameterize more loosely, like:

struct Mother{T,S,M,PV}
    label::String
    shape::S             # Reference to the actual shape
    material::M                  # Reference to material
    daughters::Vector{PV} 
end

which will allow the types to be concrete, or go for a completely different layout to allow the Mother type to be simpler yet concrete. For instance, if instead of the actual Material as a field you have a label for the material type (as an integer, for instance, to enumerate the materials).

There are some threads discussing this type of pattern:

I am not sure if there is an “ideal” solution (in Julia or other languages). The default performance of what people usually implement in C++ seems to be faster than straightforward Julia implementation of the same thing, but there are a lot of nuances and possibilities which are discussed there.

Yes, thanks, I have to rationalize why these things are not equivalent at some point to remember which is the one that works.

The T parameter is redundant here.

@peremato : One can narrow the types down again, without going overboard with the complexity:

struct Mother{S<:AbstractShape,M<:Material,PV<:PlacedVolume}
    label::String
    shape::S             # Reference to the actual shape
    material::M                  # Reference to material
    daughters::Vector{PV} 
end

The pattern is, you do

struct MyType{T<:AbstractSuper}
    x::T
end

instead of

struct MyType{T}
    x::AbstractSuper{T}
end

You get more concrete fieldtypes without more typing (well, the keyboard kind of typing.)

1 Like

Thanks for fixing my poor examples :slight_smile:

1 Like

Thanks for your suggestion but I still have a problem with the second part, which is the nested structures that are coded with PlacedVolume.

struct Mother{T,S,M,PV}
    label::String
    shape::S             # Reference to the actual shape
    material::M                  # Reference to material
    daughters::Vector{PV} 
end

struct PlacedVolume{T,....}
    transformation::Transformation3D{T}
    volume::Mother{T,S,M,PlacedVolume{T,...}}
end

I do not know what to put in the .... At a given moment I must break the ‘concreteness’ of the geometry. I cannot parametrize a mother volume with all the concrete types of the daughters, and the daughters of the daughters. I can do it for example as I did, at the level of the Mother having a member that is abstract (the shape), or I could do it a the Vector{PV} by having a Vector{AbstractPlacement}, but I would guess I will end with the same performance problems. Your help will be most welcome to find the right pattern.

1 Like

In terms of expressing it, you can just use the same pattern again:

struct PlacedVolume{T,M}
    transformation::T
    volume::M
end

and then putting constraints inside the braces, if you want. That will clean up the proliferation of parameters.

But this definition becomes mutually recursive. I actually don’t know how well the compiler handles that. You can try?

1 Like

you don’t have to enforce the correctness of this with typing system, you could enforce it with the constructor function and simply leave it as what DNF has suggested. Of course, you could always work some of the constraint into the type system if it’s natural and easy to do, but you don’t have to.

If you want polymorphism and performance (both runtime and compile time), I’d suggest trying a package like Unityper or doing it manually via enums.

2 Likes

My brain is exploding because of the recursivelity. The problem is that I have a Vector of different shaped ‘daughters’. Either, each volume is the same regardless of the shape (what I did) and then I have an uniform Vector, or I do have a different volume for each type of shape and I do have an heterogenous Vector. Clearly I cannot express in a constructor all these complexity. My guess is that having an heterogenous Vector will boil down to the same poor performance.

1 Like