I am defining a custom data type that behaves like a vector (in the linear algebra sense). I defined it as a structure with a 6-element StaticVector that holds the underlying data:
```julia
using StaticArrays

struct MyVectorType
    data::SVector{6, Float64}
end
```
One operation that I need is the cross product between instances of `MyVectorType`, which is defined by a particular combination of cross products between 3-component sub-vectors of the underlying data:
```julia
import LinearAlgebra: ×

function cross_product1(a::MyVectorType, b::MyVectorType)
    # split the vectors into two 3-component parts:
    a1 = a.data[SVector(1,2,3)]
    a2 = a.data[SVector(4,5,6)]
    b1 = b.data[SVector(1,2,3)]
    b2 = b.data[SVector(4,5,6)]
    # compute two combinations of cross products:
    part1 = a1 × b1
    part2 = a1 × b2 + a2 × b1
    # concatenate the two parts to form a new 6-component vector:
    return MyVectorType( (part1..., part2...) )
end
```
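To make the definition concrete, here is a small sanity check of the product on fixed inputs. It is only a sketch: the struct and function are repeated so the snippet runs on its own, and the test vectors `e1`, `e2`, `a`, `b` are arbitrary values chosen for illustration. With the definition above, the product should be antisymmetric, and the product of a vector with itself should vanish.

```julia
using StaticArrays
import LinearAlgebra: ×

struct MyVectorType
    data::SVector{6, Float64}
end

function cross_product1(a::MyVectorType, b::MyVectorType)
    a1 = a.data[SVector(1,2,3)]
    a2 = a.data[SVector(4,5,6)]
    b1 = b.data[SVector(1,2,3)]
    b2 = b.data[SVector(4,5,6)]
    part1 = a1 × b1
    part2 = a1 × b2 + a2 × b1
    return MyVectorType( (part1..., part2...) )
end

# Illustrative inputs (arbitrary values, not from my real application):
e1 = MyVectorType(SVector(1.0, 0.0, 0.0, 0.0, 0.0, 0.0))
e2 = MyVectorType(SVector(0.0, 1.0, 0.0, 0.0, 0.0, 0.0))
a  = MyVectorType(SVector(1.0, 2.0, 3.0, 4.0, 5.0, 6.0))
b  = MyVectorType(SVector(-2.0, 0.5, 1.0, 3.0, -1.0, 2.0))

# x-hat × y-hat = z-hat in the first block; the second block is zero
# because the last three components of e1 and e2 are zero.
@assert cross_product1(e1, e2).data ≈ SVector(0.0, 0.0, 1.0, 0.0, 0.0, 0.0)

# Antisymmetry: a × b == -(b × a)
@assert cross_product1(a, b).data ≈ -cross_product1(b, a).data

# A vector crossed with itself gives the zero vector (checked with a tolerance).
@assert all(abs.(cross_product1(a, a).data) .< 1e-12)
```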
This function has been tested to make sure that it doesn’t allocate any memory:
```julia
# two random vectors:
a = MyVectorType(rand(6))
b = MyVectorType(rand(6))

using BenchmarkTools
@btime cross_product1(a,b);
  5.357 ns (0 allocations: 0 bytes)
```
However, I have noticed something very strange: if I remove the type annotations from the arguments in the function definition, the function now allocates memory!
```julia
function cross_product2(a, b)
    # split the vectors into two 3-component parts:
    a1 = a.data[SVector(1,2,3)]
    a2 = a.data[SVector(4,5,6)]
    b1 = b.data[SVector(1,2,3)]
    b2 = b.data[SVector(4,5,6)]
    # compute two combinations of cross products:
    part1 = a1 × b1
    part2 = a1 × b2 + a2 × b1
    # concatenate the two parts to form a new 6-component vector:
    return MyVectorType( (part1..., part2...) )
end

@btime cross_product2(a,b);
  20.113 ns (1 allocation: 64 bytes)
```
How can this possibly be?
The definitions of `cross_product1` and `cross_product2` are exactly the same; I just copy-pasted and removed the type annotations. The function calls are also exactly the same, so the type annotations should not make a difference.
The amount of memory allocated is the same across many runs, so it's not a compilation thing.
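One thing I have not ruled out is the way the benchmark is invoked: `a` and `b` are non-constant globals, and the BenchmarkTools manual recommends interpolating such variables with `$`. The sketch below is a possible diagnostic, under the unconfirmed assumption that the allocation comes from the call site rather than from the function body. The helper `allocated_in_call` is hypothetical (not part of any package); it measures the call from inside a function, so global-variable access is not part of the measurement.

```julia
using StaticArrays
using BenchmarkTools
import LinearAlgebra: ×

struct MyVectorType
    data::SVector{6, Float64}
end

function cross_product2(a, b)
    a1 = a.data[SVector(1,2,3)]
    a2 = a.data[SVector(4,5,6)]
    b1 = b.data[SVector(1,2,3)]
    b2 = b.data[SVector(4,5,6)]
    part1 = a1 × b1
    part2 = a1 × b2 + a2 × b1
    return MyVectorType( (part1..., part2...) )
end

a = MyVectorType(rand(6))
b = MyVectorType(rand(6))

# Check 1: interpolate the globals, so the benchmark sees them as
# values of known type instead of untyped global variables.
@btime cross_product2($a, $b);

# Check 2: hypothetical helper that measures allocations of f(a, b)
# from inside a function, where the argument types are known.
function allocated_in_call(f, a, b)
    f(a, b)                    # warm-up call, so compilation is excluded
    return @allocated f(a, b)  # bytes allocated by the second call
end

println("bytes allocated inside a function: ",
        allocated_in_call(cross_product2, a, b))
```

If both checks report no allocation, that would point at the benchmark call on untyped globals rather than at the missing annotations themselves.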
Can anyone explain this?