This is a follow-up to the thread "dramatic performance change by adding additional unused method". The explanation given there was that the Julia optimizer only looks at a limited number of overloaded methods when deciding the actual return type at compile time.
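To make that explanation concrete, here is a minimal sketch of how the effect can be observed with `Base.return_types`. It avoids StaticArrays, and every name in it (`Shape`, `A` to `D`, `width`, `Holder`, `total`) is hypothetical and not part of the reproducer. It also assumes that the method limit is 3, which I believe is the default on recent Julia 1.x releases but may differ between versions.

```julia
# Hypothetical miniature of the situation: scalar return values, no StaticArrays.
abstract type Shape end
struct A <: Shape end
struct B <: Shape end
struct C <: Shape end
struct D <: Shape end

width(::A) = 1.0
width(::B) = 2.0
width(::C) = 3.0

# The field type is abstract, so the compiler only knows `Shape` at the call site.
struct Holder
    shape::Shape
end
total(h::Holder) = width(h.shape) + 1.0

# Inferred return type of `total` while `width` has three methods.
few = only(Base.return_types(total, (Holder,)))

# Add a fourth method that is never called, then ask inference again.
width(::D) = 4.0
many = only(Base.return_types(total, (Holder,)))

println("3 methods: ", few, "   4 methods: ", many)
```

If the assumed limit applies, the first result should be a concrete type and the second should be widened to an abstract one, which is what forces boxing and allocations in the caller.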
I have found a way to avoid the allocation problem and recover the performance. The reproducer code is:
using BenchmarkTools
using StaticArrays
const Vector3 = SVector{3,Float64}
abstract type AbstractShape end
struct Box <: AbstractShape
    x::Float64; y::Float64; z::Float64
end
function extent(b::Box)
    (Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end
struct SBox <: AbstractShape
    x::Float64; y::Float64; z::Float64
end
function extent(b::SBox)
    (Vector3(-b.x,-b.y,-b.z), Vector3(b.x,b.y,b.z))
end
struct Circle <: AbstractShape
    r::Float64
end
function extent(c::Circle)
    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end
struct SCircle <: AbstractShape
    r::Float64
end
function extent(c::SCircle)
    (Vector3(-c.r,-c.r,-c.r), Vector3(c.r,c.r,c.r))
end
struct Triangle <: AbstractShape
    a::Float64; b::Float64; c::Float64
end
function extent(t::Triangle)
    (Vector3(0,0,0), Vector3(t.a,t.b,t.c))
end
function extent_(a::AbstractShape)
    # Branch on the concrete type so that each call to extent is resolved statically.
    if isa(a,Box)
        return extent(a::Box)
    elseif isa(a,Circle)
        return extent(a::Circle)
    elseif isa(a,SCircle)
        return extent(a::SCircle)
    elseif isa(a,SBox)
        return extent(a::SBox)
    elseif isa(a,Triangle)
        return extent(a::Triangle)
    end
end
struct Figure
    label::String
    shape::AbstractShape
end
figure = Figure("1", Box(1,1,1))
function area(fig::Figure)
    lower, upper = extent_(fig.shape)
    sum = 0.
    for i in 1:1000
        sum += (upper[1]-lower[1]) * (upper[2]-lower[2])
    end
    sum
end
area(figure)
Note that the `area` function calls `extent_` instead of calling `extent` directly. Calling `extent` directly would result in 8k allocations.
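For anyone who wants to check the difference without installing anything, the comparison can be sketched with `@allocated` from Base. This is a reduced stand-in for the reproducer, not the reproducer itself: the types `Sq`, `Ci`, `Tr`, `Hx`, the functions `bounds`, `bounds_`, `span_direct`, `span_branch` and the helper `measure` are all hypothetical, and the byte counts will depend on the Julia version.

```julia
# Hypothetical stand-ins for the shapes; four methods, assumed to exceed the inference limit.
abstract type Shape end
struct Sq <: Shape; s::Float64; end
struct Ci <: Shape; r::Float64; end
struct Tr <: Shape; a::Float64; end
struct Hx <: Shape; h::Float64; end

bounds(x::Sq) = (-x.s, x.s)
bounds(x::Ci) = (-x.r, x.r)
bounds(x::Tr) = (0.0, x.a)
bounds(x::Hx) = (-x.h, x.h)

# Manual branching on the concrete type, in the spirit of extent_.
function bounds_(x::Shape)
    if x isa Sq
        return bounds(x)
    elseif x isa Ci
        return bounds(x)
    elseif x isa Tr
        return bounds(x)
    elseif x isa Hx
        return bounds(x)
    end
    error("unsupported shape type")
end

struct Fig
    shape::Shape   # abstract field, as in Figure
end

function span_direct(f::Fig)
    lo, hi = bounds(f.shape)    # dynamic dispatch on an abstract value
    acc = 0.0
    for i in 1:1000
        acc += hi - lo
    end
    acc
end

function span_branch(f::Fig)
    lo, hi = bounds_(f.shape)   # explicit type branches
    acc = 0.0
    for i in 1:1000
        acc += hi - lo
    end
    acc
end

# Call once to compile, then report the bytes allocated by a second call.
measure(f, x) = (f(x); @allocated f(x))

fig = Fig(Sq(1.0))
alloc_direct = measure(span_direct, fig)
alloc_branch = measure(span_branch, fig)
println("direct: ", alloc_direct, " bytes   branch: ", alloc_branch, " bytes")
```

Both functions return the same value; only the allocation counts are expected to differ, and only if the compiler really gives up on inferring `bounds(f.shape)`.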
The questions are:
- Is this a good solution?
- Why doesn't Julia do the dispatching I am doing in `extent_` automatically when the argument has an abstract type?
Obviously the function `extent_` is not very nice, since it makes the library/module non-extendable. But this solution is much better than many of the suggestions you kindly offered in the previous discussion thread to enable polymorphism.