I am defining a new struct for a project where performance matters. The struct has a single field, an MVector (from the StaticArrays.jl package). However, operations on the struct are up to 4x slower than the same operations on the MVector directly.
Here is the definition of the struct:
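using StaticArrays, BenchmarkTools

struct Foo{N}
    dev::MVector{N}
    # Inner constructor: accepts a Vector, SVector, or MVector and stores an MVector copy
    Foo(v::A) where A <: Union{Vector, SVector, MVector} = new{length(v)}(MVector{length(v)}(v))
end

Base.copy(foo::Foo) = Foo(foo.dev)

function Base.:+(a::Foo, b::Foo)
    return Foo(a.dev + b.dev)
end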
And the benchmarks:
julia> mvec = MVector{100}(rand(100))
[...]
julia> x = Foo(rand(100))
[...]
julia> @benchmark copy($mvec)
BenchmarkTools.Trial: 10000 samples with 984 evaluations.
Range (min … max): 43.874 ns … 323.259 ns ┊ GC (min … max): 0.00% … 71.34%
Time (median): 62.593 ns ┊ GC (median): 0.00%
Time (mean ± σ): 67.954 ns ± 24.152 ns ┊ GC (mean ± σ): 3.85% ± 8.97%
43.9 ns Histogram: log(frequency) by time 221 ns <
Memory estimate: 816 bytes, allocs estimate: 1.
julia> @benchmark copy($x)
BenchmarkTools.Trial: 10000 samples with 822 evaluations.
Range (min … max): 149.393 ns … 527.189 ns ┊ GC (min … max): 0.00% … 65.09%
Time (median): 161.423 ns ┊ GC (median): 0.00%
Time (mean ± σ): 165.675 ns ± 33.197 ns ┊ GC (mean ± σ): 2.35% ± 7.32%
149 ns Histogram: log(frequency) by time 417 ns <
Memory estimate: 832 bytes, allocs estimate: 2.
julia> @benchmark +($mvec, $mvec)
BenchmarkTools.Trial: 10000 samples with 987 evaluations.
Range (min … max): 37.670 ns … 386.473 ns ┊ GC (min … max): 0.00% … 46.21%
Time (median): 60.061 ns ┊ GC (median): 0.00%
Time (mean ± σ): 62.614 ns ± 24.071 ns ┊ GC (mean ± σ): 4.30% ± 9.31%
37.7 ns Histogram: log(frequency) by time 217 ns <
Memory estimate: 816 bytes, allocs estimate: 1.
julia> @benchmark +($x, $x)
BenchmarkTools.Trial: 10000 samples with 492 evaluations.
Range (min … max): 223.604 ns … 915.809 ns ┊ GC (min … max): 0.00% … 48.79%
Time (median): 240.069 ns ┊ GC (median): 0.00%
Time (mean ± σ): 248.639 ns ± 56.365 ns ┊ GC (mean ± σ): 2.50% ± 7.61%
224 ns Histogram: log(frequency) by time 607 ns <
Memory estimate: 1.61 KiB, allocs estimate: 3.
Is there something I am doing wrong? How can I fix this?
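One guess, which I have not benchmarked: the field type MVector{N} leaves the element type unspecified, so it is not a concrete type, and the compiler may be unable to infer the type of foo.dev. Below is a minimal sketch of a variant with a fully parameterized field. The name FooT and the type parameter T are hypothetical, introduced only for this illustration, and I am assuming the extra cost comes from the abstract field type rather than from something else.

```julia
using StaticArrays

# Hypothetical variant of Foo: MVector{N,T} is a concrete type once N and T
# are known, whereas MVector{N} leaves the element type free.
struct FooT{N,T}
    dev::MVector{N,T}
end

# Convenience constructor for other vectors (e.g. Vector, SVector).
# An MVector argument dispatches to the default constructor instead,
# which stores the vector as-is without copying it.
FooT(v::AbstractVector) = FooT(MVector{length(v)}(v))

# Copy the wrapped vector explicitly, since the default constructor does not.
Base.copy(foo::FooT) = FooT(copy(foo.dev))

# Restrict addition to operands of the same length N.
Base.:+(a::FooT{N}, b::FooT{N}) where {N} = FooT(a.dev + b.dev)
```

Whether the field type is concrete can be checked with isconcretetype(fieldtype(typeof(x), :dev)) for an instance x. Constructing from a plain Vector still depends on a length known only at run time, so that step would presumably stay outside any hot loop.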