liuyxpp
January 26, 2021, 12:58pm
I have a struct
struct Chain
    N
    x
    y
    z
end
where x, y, and z are vectors of length N.
I have a function to process a Chain instance:
function foo1(chain)
    for i in 1:chain.N
        for j in (i+1):chain.N
            do_something_with_chain
        end
    end
end
When I benchmark its performance using BenchmarkTools with @benchmark, I see a huge number of allocations (for N = 200, on the order of 200,000).
> chain = Chain(200, randn(200), randn(200), randn(200))
> @benchmark foo1($chain)
However, if I construct chain inside the body of the function instead:
function foo2()
    chain = Chain(200, randn(200), randn(200), randn(200))
    for i in 1:chain.N
        for j in (i+1):chain.N
            do_something_with_chain
        end
    end
end
then when I benchmark it, foo2 has far fewer allocations (about 200) and is much faster.
> @benchmark foo2()
My questions are:
Is this allocation issue connected to BenchmarkTools or passing around big structs?
If it is a problem of passing around big structs, how should I pass them efficiently?
Thanks!
Can you give a real example of do_something_with_chain? As written, it almost seems like the code could be optimized down to see that chain.N is the only field accessed, but I imagine that’s not true in practice.
liuyxpp:
struct Chain
    N
    x
    y
    z
end
Does the difference go away if you add type annotations to the fields of your struct? (You should do this anyway to get better performance.)
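For context on why annotations matter: a field declared without a type is stored as Any, so the compiler cannot infer what chain.N or chain.x[i] returns and has to box values and dispatch at run time, which shows up as allocations inside the loops. A minimal sketch, where UntypedChain and TypedChain are hypothetical names used only for this illustration:

```julia
# Hypothetical types for illustration; only the field layout follows the question.
struct UntypedChain
    N
    x
end

struct TypedChain
    N::Int
    x::Vector{Float64}
end

# Fields without annotations are stored as `Any`.
println(fieldtypes(UntypedChain))
println(fieldtypes(TypedChain))

# When every field type is concrete, field access can be fully inferred.
println(all(isconcretetype, fieldtypes(UntypedChain)))  # false
println(all(isconcretetype, fieldtypes(TypedChain)))    # true
```

Running @code_warntype foo1(chain) at the REPL is another way to check: values inferred as Any are highlighted there.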
Sure, replace do_something_with_chain with the following code:
p1 = [chain.x[i], chain.y[i], chain.z[i]]
p2 = [chain.x[j], chain.y[j], chain.z[j]]
if is_colliding(p1, p2, radius)
    return false
end
And the related functions are
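# True when two spheres of the given radius centered at p1 and p2 overlap.
function is_colliding(p1, p2, radius)
    d2 = distance2(p1, p2)
    return d2 < 4 * radius * radius
end
# Squared Euclidean norm of p, and squared distance between p1 and p2.
distance2(p) = sum(abs2, p)
distance2(p1, p2) = distance2(p1 .- p2)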
However, even if do_something_with_chain does nothing, there are still 20,000 allocations.
I still don’t fully understand exactly what you’ve benchmarked. Can you provide a single snippet of code that can be directly copied and pasted? That would make it easy to see whether @sostock’s proposal solves everything.
This seems to solve my problem. Thanks!
As suggested by @sostock, I should annotate the fields of my struct to avoid unnecessary allocations, like this:
struct Chain{T<:Real}
    N::Int
    x::Vector{T}
    y::Vector{T}
    z::Vector{T}
end
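For anyone who wants something that can be pasted directly, here is a rough self-contained sketch built around the annotated struct. The function has_collision and the radius value 0.01 are assumptions made for illustration, not the original code. The sketch also computes the squared distance from scalar components rather than building the temporary p1 and p2 arrays, which is a separate change from the type annotations.

```julia
# Sketch only: `has_collision` and the radius value are assumptions.
struct Chain{T<:Real}
    N::Int
    x::Vector{T}
    y::Vector{T}
    z::Vector{T}
end

# Squared distance between beads i and j, without temporary arrays.
function distance2(chain::Chain, i, j)
    dx = chain.x[i] - chain.x[j]
    dy = chain.y[i] - chain.y[j]
    dz = chain.z[i] - chain.z[j]
    return dx * dx + dy * dy + dz * dz
end

# Returns true as soon as any pair of beads is closer than 2 * radius.
function has_collision(chain::Chain, radius)
    for i in 1:chain.N
        for j in (i + 1):chain.N
            distance2(chain, i, j) < 4 * radius * radius && return true
        end
    end
    return false
end

chain = Chain(200, randn(200), randn(200), randn(200))
has_collision(chain, 0.01)                      # first call compiles the method
println(@allocated has_collision(chain, 0.01))  # bytes allocated by the second call
```

With concrete field types the loop itself should allocate little or nothing. Replacing @allocated with @benchmark has_collision($chain, 0.01) from BenchmarkTools gives the timing comparison.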