I’m getting some quite unexpected, non-deterministic benchmarking results on a toy problem.
abstract type Shape end
area(::Shape) = 0.0

struct Square <: Shape
    side::Float64
end
area(s::Square) = s.side * s.side

struct Rectangle <: Shape
    width::Float64
    height::Float64
end
area(r::Rectangle) = r.width * r.height

struct Triangle <: Shape
    base::Float64
    height::Float64
end
area(t::Triangle) = t.base * t.height / 2

struct Circle <: Shape
    radius::Float64
end
area(c::Circle) = π * c.radius^2
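As a quick sanity check of these definitions (a minimal sketch; the definitions are repeated so the snippet runs stand-alone, and Blob is a hypothetical shape added only to exercise the fallback):

```julia
abstract type Shape end
area(::Shape) = 0.0                  # fallback for shapes without a specific method

struct Circle <: Shape
    radius::Float64
end
area(c::Circle) = π * c.radius^2

struct Blob <: Shape end             # hypothetical: has no area method of its own

@assert area(Circle(1.0)) ≈ π        # dispatches to the Circle method
@assert area(Blob()) == 0.0          # falls back to the Shape default
```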
I have one million shapes that I’ve generated, and here’s a screenshot I captured showing this particular “unusual” benchmarking result:
main1 is defined as main1(shapes) = sum(area.(shapes)), and @benchmark gives 16 ms on average for Vector{Any} but 37 ms on average for Vector{Shape}.
I am expecting these to be similar in performance.
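For anyone trying to reproduce this, here's a minimal sketch of roughly what I'm running (type definitions repeated so the snippet is self-contained; I use a smaller element count here to keep the demo quick, and only two shape kinds):

```julia
# using BenchmarkTools  # for @benchmark (not needed for the correctness check below)

abstract type Shape end
area(::Shape) = 0.0
struct Square <: Shape
    side::Float64
end
area(s::Square) = s.side * s.side
struct Circle <: Shape
    radius::Float64
end
area(c::Circle) = π * c.radius^2

main1(shapes) = sum(area.(shapes))

n = 100_000  # smaller than the original 1_000_000 to keep the demo quick
mixed        = [rand(Bool) ? Square(rand()) : Circle(rand()) for _ in 1:n]
shapes_any   = Vector{Any}(mixed)    # element type Any
shapes_shape = Vector{Shape}(mixed)  # element type Shape (abstract)

# @benchmark main1($shapes_any)
# @benchmark main1($shapes_shape)
@assert main1(shapes_any) ≈ main1(shapes_shape)  # same elements, same result
```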
However, this is not reproducible; if I run it again, I get the results I expect. Here’s my versioninfo():
Julia Version 1.10.2
Commit bd47eca2c8a (2024-03-01 10:14 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin22.4.0)
CPU: 10 × Apple M1 Pro
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
JULIA_PROJECT = @.
This has happened enough times at random that I’m worried I’m doing something wrong. Can anyone else reproduce this? Am I using the benchmarking tools incorrectly, or am I missing something else?
As an aside, maybe a package I built (MixedStructTypes.jl) could be of help to you:
using MixedStructTypes, BenchmarkTools
abstract type AbstractShape end
@sum_structs Shape <: AbstractShape begin
    @kwdef struct Square
        side::Float64 = rand()
    end
    @kwdef struct Rectangle
        width::Float64 = rand()
        height::Float64 = rand()
    end
    @kwdef struct Triangle
        base::Float64 = rand()
        height::Float64 = rand()
    end
    @kwdef struct Circle
        radius::Float64 = rand()
    end
end
function area(sh)
    if kindof(sh) === :Square
        area_square(sh)
    elseif kindof(sh) === :Rectangle
        area_rectangle(sh)
    elseif kindof(sh) === :Triangle
        area_triangle(sh)
    elseif kindof(sh) === :Circle
        area_circle(sh)
    else
        area_default(sh)
    end
end

area_square(s) = s.side * s.side
area_rectangle(r) = r.width * r.height
area_triangle(t) = t.base * t.height / 2
area_circle(c) = π * c.radius^2
area_default(::AbstractShape) = 0.0
main1(shapes) = sum(area.(shapes))
count = 1_000_000
shapes = [rand((Square,Rectangle,Triangle,Circle))() for _ in 1:count];
which results in
julia> @benchmark main1($shapes)
BenchmarkTools.Trial: 725 samples with 1 evaluation.
Range (min … max):  6.211 ms … 10.326 ms  │ GC (min … max): 0.00% … 10.52%
Time  (median):     6.246 ms              │ GC (median):    0.00%
Time  (mean ± σ):   6.887 ms ±  1.168 ms  │ GC (mean ± σ):  1.56% ± 4.32%
[histogram omitted]
6.21 ms          Histogram: log(frequency) by time          9.49 ms <
Memory estimate: 7.63 MiB, allocs estimate: 2.
It is still a bit annoying that you need the if-else chain inside the area function, though. I’m planning to lift this limitation and instead provide a macro which does everything by itself.
That’s a neat package! I had seen SumTypes.jl before but didn’t know about MixedStructTypes.jl. I haven’t had the chance to use them yet!
Maybe something to do with the GC? That’s the only explanation I have. Because the function is compiled at this point, it shouldn’t really behave differently on multiple runs.
I also find it odd that the deviations within each benchmark are small, meaning that if it is a GC issue, it only kicks in between benchmarking runs, not during the 100–300 samples within a single run.
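One way to probe the GC hypothesis is to force a collection before every sample, so each sample starts from a comparable heap state; if the run-to-run variation disappears, GC state between runs is the likely culprit. A sketch using BenchmarkTools' gcsample option (the workload here is a stand-in, not the original shapes benchmark):

```julia
using BenchmarkTools

data = rand(10^6)
work(xs) = sum(abs2, xs)  # stand-in workload

# gcsample=true runs GC.gc() before each sample, so every sample starts from
# a freshly collected heap (at the cost of much slower benchmarking).
b = @benchmark work($data) gcsample=true samples=100 seconds=1
```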
Side note: this would probably be more efficient as sum(area, shapes), which applies the function in the first argument to every element of the second argument while summing the results. This removes the need to allocate an intermediate array of area values.
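To illustrate the difference (a minimal sketch with a single stand-alone shape type): the broadcast version materializes a temporary array of areas, while the two-argument sum folds directly over the elements:

```julia
struct Circle
    radius::Float64
end
area(c::Circle) = π * c.radius^2

shapes = [Circle(rand()) for _ in 1:1_000]

total_broadcast = sum(area.(shapes))  # allocates a 1000-element temporary array
total_direct    = sum(area, shapes)   # no intermediate array
@assert total_broadcast ≈ total_direct
```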
It is indeed faster, with fewer allocations, to run sum(main1, (arr1, arr2, arr3, arr4)) when all the arrays have a concrete element type, so you are right in that regard.
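Concretely, the idea is to keep one concretely-typed vector per shape kind and sum over each array separately; a sketch with two hypothetical shape kinds:

```julia
struct Square
    side::Float64
end
struct Circle
    radius::Float64
end
area(s::Square) = s.side^2
area(c::Circle) = π * c.radius^2

squares = [Square(rand()) for _ in 1:500]  # Vector{Square}: concrete eltype
circles = [Circle(rand()) for _ in 1:500]  # Vector{Circle}: concrete eltype

main1(shapes) = sum(area.(shapes))

# Summing per array lets each main1 call specialize on a concrete eltype,
# instead of doing dynamic dispatch on every element of one mixed vector.
total = sum(main1, (squares, circles))
@assert total ≈ main1(vcat(squares, circles))  # same answer as the mixed vector
```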
If you cannot avoid containers with abstract value types, it is sometimes better to parametrize with Any to avoid runtime type checks; e.g. IdDict{Any, Any} performs better than IdDict{Type, Vector}.
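For instance (a toy sketch; actual timings will vary by machine and workload): both dictionaries below behave identically, but the {Any, Any} parametrization skips the convert/typeassert work that the narrower abstract parameters incur on access:

```julia
d_any    = IdDict{Any, Any}()
d_narrow = IdDict{Type, Vector}()  # abstract key and value types

for T in (Int, Float64, Bool)
    d_any[T]    = Vector{T}()
    d_narrow[T] = Vector{T}()
end

# Same behavior either way; the {Any, Any} version avoids per-access
# convert/typeassert against the abstract Type/Vector parameters.
@assert d_any[Int] == d_narrow[Int]
```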