# Strange way to recover performance with polymorphic types

This is a followup of dramatic performance change by adding additional unused method. The explanation was the existing limit on the number of overloaded methods the Julia optimizer is looking to decide the actual return type at compile type.

I have found a way to avoid the allocation problem and recover the performance. The reproducer code is

``````using BenchmarkTools
using StaticArrays

const Vector3 = SVector{3,Float64}

abstract type AbstractShape end

struct Box <: AbstractShape
x::Float64; y::Float64; z::Float64
end
function extent(b::Box)
(Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end

struct SBox <: AbstractShape
x::Float64; y::Float64; z::Float64
end
function extent(b::SBox)
(Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end

struct Circle <: AbstractShape
r::Float64
end
function extent(c::Circle)
(Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end

struct SCircle <: AbstractShape
r::Float64
end
function extent(c::SCircle)
(Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end

struct Triangle <: AbstractShape
a::Float64; b::Float64; c::Float64
end
function extent(t::Triangle)
(Vector3(0,0,0), Vector3(t.a,t.b,t.c))
end

function extent_(a::AbstractShape)
if isa(a,Box)
return extent(a::Box)
elseif isa(a,Circle)
return extent(a::Circle)
elseif isa(a,Circle)
return extent(a::SCircle)
elseif isa(a,Circle)
return extent(a::SBox)
elseif isa(a,Triangle)
return extent(a::Triangle)
end
end

struct Figure
label::String
shape::AbstractShape
end

figure = Figure("1", Box(1,1,1))

function area(fig::Figure)
lower, upper =  extent_(fig.shape)
sum = 0.
for i in 1:1000
sum += (upper-lower) * (upper-lower)
end
sum
end

area(figure)
``````

Note that I am calling `extent_` instead of calling directly `extent` in the `area` function. If I was calling `extent` would result in 8k allocations.
The questions are:

• Is this a good solution?
• Why Julia is not doing the dispatching I am doing in `extent_` automatically when using an abstract type argument?

Obviously the function `extent_` is not very nice since makes the library/module non-extendable. But this solution is much better than many suggestions you kindly offered in the previous discussion thread to enable polymorphism.

2 Likes

Because Julia can’t know that the list of possible subtypes of `AbstractShape` is limited to `Box`, `Circle`, `SCircle`, `SBox`, and `Triangle`. The user could aways define another subtype after your function has been compiled.

In principle, Julia could do this optimization if you instead declared `const MyShapes = Union{Box,Circle,SCircle,SBox,Triangle}` and used `MyShape` as your `Figure.shape` type. I know Julia does this for a sufficiently small union type, but I’m not sure if it will do it for larger unions.

4 Likes

Last time I checked, Julia optimized only Union with two parameters.

1 Like

Las time we checked, I think it was doing to about 16. There is there somewhere in this thread.

But that works for effectively `Union` types. Following the code above, these are some alternatives:

1. Annotate the output of extent:
``````function area2(fig::Figure)
lower, upper =  extent(fig.shape)::Tuple{Vector3,Vector3}
sum = 0.
for i in 1:1000
sum += (upper-lower) * (upper-lower)
end
sum
end
``````

which gives:

``````julia> figure = Figure("1", Box(1,1,1))
Figure("1", Box(1.0, 1.0, 1.0))

julia> @btime area2(\$figure)
1.020 μs (1 allocation: 64 bytes)
4000.0
``````
1. Making `Figure` concrete:
``````struct FigureConcrete{Shape}
label::String
shape::Shape
end

function area3(fig)
lower, upper =  extent(fig.shape)
sum = 0.
for i in 1:1000
sum += (upper-lower) * (upper-lower)
end
sum
end
``````

which gives:

``````julia> figure = FigureConcrete("1", Box(1,1,1))
FigureConcrete{Box}("1", Box(1.0, 1.0, 1.0))

julia> @btime area3(\$figure)
954.727 ns (0 allocations: 0 bytes)
4000.0
``````
1. Using the union types:
``````struct UnionFigure
label::String
shape::Union{Box, Circle, SBox, SCircle, Triangle}
end
``````

which gives:

``````julia> figure = UnionFigure("1", Box(1,1,1))
UnionFigure("1", Box(1.0, 1.0, 1.0))

julia> @btime area3(\$figure)
960.619 ns (0 allocations: 0 bytes)
4000.0
``````

(this one will work well to ~16 types of shapes, as far as I remember)

Whenever possible I would go with (2).

(the example as it is is probably not the best one, it would be more realistic probably if `extent` was called in the loop. Then the type instability is harder. You would get here:

``````area_figure(tup) = (tup-tup) * (tup-tup)

function area4(fig)
sum = 0.
for i in 1:1000
sum += area_figure(extent(fig.shape))
end
sum
end
``````
``````julia> figure = UnionFigure("1", Box(1,1,1))
UnionFigure("1", Box(1.0, 1.0, 1.0))

julia> @btime area4(\$figure)
955.478 ns (0 allocations: 0 bytes)
4000.0

julia> figure = FigureConcrete("1", Box(1,1,1))
FigureConcrete{Box}("1", Box(1.0, 1.0, 1.0))

julia> @btime area4(\$figure)
959.455 ns (0 allocations: 0 bytes)
4000.0

julia> figure = Figure("1",Box(1,1,1))
Figure("1", Box(1.0, 1.0, 1.0))

julia> @btime area4(\$figure)
43.847 μs (3000 allocations: 93.75 KiB)
4000.0

``````
1 Like

Thanks very much for your detailed reply and possible solutions. I am afraid that none is really fully satisfactory solution.

This is really interesting and useful in other cases I have. I did attempt to annotate the actual definition of each `extent` function but it did not have any effect. In any case, there is still one allocation per call and is not the case when calling `extent_`.

I cannot do this since in the complete problem I do have a nested hierarchy of figures and I would require an heterogeneous vector of concrete figures. See the initial thread.

I will try this one providing that the limit is 16 or higher, although it suffers from the limitation that the module will no be extendable with additional figures provided by a client of the module.
I wonder if `Union{Box, Circle, SBox, SCircle, Triangle,AbstrcatShape}` would allow me to add later additional shapes.

1 Like

Note that avoiding that by having abstract figures won’t give much benefit. Probably keeping the abstract part at the highest level possible (the vector) will be better, because the functions operating on each instance will be type stable at least.

Also, if a function acts on such vector, I think you could define a version dispatched on a vector of union types, and another on the abstract type. Internally you could use the faster Union, while allowing the user to extend the types with a fallback to the other alternative. If you manage to keep this dispatch option at the highest possible interface, the choice for one or other strategy will be easy.

But yes, there is no perfect solution. My bet would be to keep that type abstraction as close to the surface as possible, otherwise these problems will permeate everything in the code.

2 Likes