Strange way to recover performance with polymorphic types

peremato · November 29, 2021, 5:23pm

This is a followup of dramatic performance change by adding additional unused method. The explanation was the existing limit on the number of overloaded methods the Julia optimizer is looking to decide the actual return type at compile type.

I have found a way to avoid the allocation problem and recover the performance. The reproducer code is

using BenchmarkTools
using StaticArrays

const Vector3 = SVector{3,Float64}

abstract type AbstractShape end

struct Box <: AbstractShape
    x::Float64; y::Float64; z::Float64
end
function extent(b::Box)
    (Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end

struct SBox <: AbstractShape
    x::Float64; y::Float64; z::Float64
end
function extent(b::SBox)
    (Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end

struct Circle <: AbstractShape
    r::Float64
end
function extent(c::Circle)
    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end

struct SCircle <: AbstractShape
    r::Float64
end
function extent(c::SCircle)
    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end

struct Triangle <: AbstractShape
    a::Float64; b::Float64; c::Float64
end
function extent(t::Triangle)
    (Vector3(0,0,0), Vector3(t.a,t.b,t.c))
end

function extent_(a::AbstractShape)
    if isa(a,Box)
        return extent(a::Box)
    elseif isa(a,Circle)
        return extent(a::Circle)
    elseif isa(a,Circle)
        return extent(a::SCircle)
    elseif isa(a,Circle)
        return extent(a::SBox)
    elseif isa(a,Triangle)
        return extent(a::Triangle)    
    end
end

struct Figure
    label::String
    shape::AbstractShape
end

figure = Figure("1", Box(1,1,1))

function area(fig::Figure)
    lower, upper =  extent_(fig.shape)
    sum = 0.
    for i in 1:1000
        sum += (upper[1]-lower[1]) * (upper[2]-lower[2])
    end
    sum
end

area(figure)

Note that I am calling extent_ instead of calling directly extent in the area function. If I was calling extent would result in 8k allocations.
The questions are:

Is this a good solution?
Why Julia is not doing the dispatching I am doing in extent_ automatically when using an abstract type argument?

Obviously the function extent_ is not very nice since makes the library/module non-extendable. But this solution is much better than many suggestions you kindly offered in the previous discussion thread to enable polymorphism.

stevengj · November 29, 2021, 5:55pm

Because Julia can’t know that the list of possible subtypes of AbstractShape is limited to Box, Circle, SCircle, SBox, and Triangle. The user could aways define another subtype after your function has been compiled.

In principle, Julia could do this optimization if you instead declared const MyShapes = Union{Box,Circle,SCircle,SBox,Triangle} and used MyShape as your Figure.shape type. I know Julia does this for a sufficiently small union type, but I’m not sure if it will do it for larger unions.

LaurentPlagne · November 29, 2021, 6:05pm

Last time I checked, Julia optimized only Union with two parameters.

lmiq · November 29, 2021, 6:38pm

Las time we checked, I think it was doing to about 16. There is there somewhere in this thread.

But that works for effectively Union types. Following the code above, these are some alternatives:

Annotate the output of extent:

function area2(fig::Figure)
    lower, upper =  extent(fig.shape)::Tuple{Vector3,Vector3}
    sum = 0.
    for i in 1:1000
        sum += (upper[1]-lower[1]) * (upper[2]-lower[2])
    end
    sum
end

which gives:

julia> figure = Figure("1", Box(1,1,1))
Figure("1", Box(1.0, 1.0, 1.0))

julia> @btime area2($figure)
  1.020 μs (1 allocation: 64 bytes)
4000.0

Making Figure concrete:

struct FigureConcrete{Shape}
    label::String
    shape::Shape
end

function area3(fig)
    lower, upper =  extent(fig.shape)
    sum = 0.
    for i in 1:1000
        sum += (upper[1]-lower[1]) * (upper[2]-lower[2])
    end
    sum
end

which gives:

julia> figure = FigureConcrete("1", Box(1,1,1))
FigureConcrete{Box}("1", Box(1.0, 1.0, 1.0))

julia> @btime area3($figure)
  954.727 ns (0 allocations: 0 bytes)
4000.0

Using the union types:

struct UnionFigure
    label::String
    shape::Union{Box, Circle, SBox, SCircle, Triangle}
end

which gives:

julia> figure = UnionFigure("1", Box(1,1,1))
UnionFigure("1", Box(1.0, 1.0, 1.0))

julia> @btime area3($figure)
  960.619 ns (0 allocations: 0 bytes)
4000.0

(this one will work well to ~16 types of shapes, as far as I remember)

Whenever possible I would go with (2).

(the example as it is is probably not the best one, it would be more realistic probably if extent was called in the loop. Then the type instability is harder. You would get here:

area_figure(tup) = (tup[2][1]-tup[1][1]) * (tup[2][2]-tup[1][2])

function area4(fig)
    sum = 0.
    for i in 1:1000
        sum += area_figure(extent(fig.shape))
    end
    sum
end

julia> figure = UnionFigure("1", Box(1,1,1))
UnionFigure("1", Box(1.0, 1.0, 1.0))

julia> @btime area4($figure)
  955.478 ns (0 allocations: 0 bytes)
4000.0

julia> figure = FigureConcrete("1", Box(1,1,1))
FigureConcrete{Box}("1", Box(1.0, 1.0, 1.0))

julia> @btime area4($figure)
  959.455 ns (0 allocations: 0 bytes)
4000.0

julia> figure = Figure("1",Box(1,1,1))
Figure("1", Box(1.0, 1.0, 1.0))

julia> @btime area4($figure)
  43.847 μs (3000 allocations: 93.75 KiB)
4000.0

peremato · November 30, 2021, 8:41am

Thanks very much for your detailed reply and possible solutions. I am afraid that none is really fully satisfactory solution.

This is really interesting and useful in other cases I have. I did attempt to annotate the actual definition of each extent function but it did not have any effect. In any case, there is still one allocation per call and is not the case when calling extent_.

I cannot do this since in the complete problem I do have a nested hierarchy of figures and I would require an heterogeneous vector of concrete figures. See the initial thread.

I will try this one providing that the limit is 16 or higher, although it suffers from the limitation that the module will no be extendable with additional figures provided by a client of the module.
I wonder if Union{Box, Circle, SBox, SCircle, Triangle,AbstrcatShape} would allow me to add later additional shapes.

lmiq · November 30, 2021, 8:57am

Note that avoiding that by having abstract figures won’t give much benefit. Probably keeping the abstract part at the highest level possible (the vector) will be better, because the functions operating on each instance will be type stable at least.

Also, if a function acts on such vector, I think you could define a version dispatched on a vector of union types, and another on the abstract type. Internally you could use the faster Union, while allowing the user to extend the types with a fallback to the other alternative. If you manage to keep this dispatch option at the highest possible interface, the choice for one or other strategy will be easy.

But yes, there is no perfect solution. My bet would be to keep that type abstraction as close to the surface as possible, otherwise these problems will permeate everything in the code.

Topic		Replies	Views
Dramatic performance change by adding additional unused method Performance	33	1938	November 23, 2021
Performance problems when dealing with deliberately type-unstable code. How to use type knowledge better? Performance	33	2069	December 30, 2020
Allocation and slow down when # of types involved increase (slower than C++ virtual methods) Performance question , hep	57	2830	November 9, 2022
Alternate ways to deal with arrays of subtypes of abstract type Performance type , type-stability	11	1168	April 19, 2022
Avoiding Vectors of Abstract Types Performance question , type-stability	22	4428	February 17, 2022

Strange way to recover performance with polymorphic types

Related topics