Passing around large structs efficiently

I have a struct

struct Chain
    N
    x
    y
    z
end

where x, y, and z are vectors of length N.

I have a function to process a Chain instance

function foo1(chain)
    for i in 1:chain.N
        for j in (i+1):chain.N
            do_something_with_chain
        end
    end
end

When I benchmark it with @benchmark from BenchmarkTools, I see a huge number of allocations (for N = 200, on the order of 200,000).

> chain = Chain(200, randn(200), randn(200), randn(200))
> @benchmark foo1($chain)

However, if I construct chain inside the body of the function, as in

function foo2()
    chain = Chain(200, randn(200), randn(200), randn(200))
    for i in 1:chain.N
        for j in (i+1):chain.N
            do_something_with_chain
        end
    end
end

then foo2 makes far fewer allocations (about 200) and runs much faster when I benchmark it.

> @benchmark foo2()

My questions are:

  1. Is this allocation issue connected to BenchmarkTools or passing around big structs?
  2. If it's a problem with passing around big structs, how should I pass them efficiently?

Thanks!

Can you give a real example of do_something_with_chain? As written, it almost seems like the code could be optimized down to see that chain.N is the only field accessed, but I imagine that’s not true in practice.


Does the difference go away if you add type annotations to the fields of your struct? (You should do this anyway to get better performance.)
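
To illustrate why the annotations matter: a field declared without a type is implicitly typed as Any, so the compiler cannot infer the types of values read from it inside a function that receives the struct as an argument. A minimal check, using the hypothetical name UntypedChain as a stand-in for the struct from the question:

```julia
# Hypothetical copy of the struct from the question, with no field annotations.
struct UntypedChain
    N
    x
    y
    z
end

# Fields without annotations are implicitly declared as Any.
untyped_field_types = fieldtypes(UntypedChain)

# Any is an abstract type, so none of the fields has a concrete type.
all_fields_concrete = all(isconcretetype, untyped_field_types)
```

With abstractly typed fields, values such as chain.N or chain.x[i] generally have to be handled through run-time dispatch, which is a likely source of the allocations reported above.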


Sure. Replace do_something_with_chain with the following code:

p1 = [chain.x[i], chain.y[i], chain.z[i]]
p2 = [chain.x[j], chain.y[j], chain.z[j]]
if is_colliding(p1, p2, radius)
    return false
end

And the related functions are

function is_colliding(p1, p2, radius)
    d2 = distance2(p1, p2)
    return d2 < 4 * radius * radius
end

distance2(p) = sum(abs2, p)
distance2(p1, p2) = distance2(p1.-p2)
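
Separately from the field types, each array literal such as [chain.x[i], chain.y[i], chain.z[i]] creates a new Vector on every iteration. A sketch of one possible alternative, assuming the points can be passed as tuples instead (the sample points and radii below are made up for illustration):

```julia
# Same helper functions as above; they also work on tuples, because
# broadcasting over tuples returns a tuple rather than a heap-allocated Vector.
distance2(p) = sum(abs2, p)
distance2(p1, p2) = distance2(p1 .- p2)
is_colliding(p1, p2, radius) = distance2(p1, p2) < 4 * radius * radius

# Illustrative points one unit apart along the x axis.
p1 = (0.0, 0.0, 0.0)
p2 = (1.0, 0.0, 0.0)

colliding_near = is_colliding(p1, p2, 1.0)   # distance 1 is less than 2 * radius = 2
colliding_far = is_colliding(p1, p2, 0.25)   # distance 1 is greater than 2 * radius = 0.5
```

In the loop, this would correspond to writing p1 = (chain.x[i], chain.y[i], chain.z[i]) with parentheses instead of square brackets.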

However, even if do_something_with_chain does nothing, there are still 20,000 allocations.

I still don’t fully understand exactly what you’ve benchmarked. Can you provide a single snippet of code that can be copied and pasted directly? That would make it easy to see whether @sostock’s proposal solves everything.

This seems to solve my problem. Thanks!


As suggested by @sostock, I should annotate the fields of my struct with concrete types to avoid unnecessary allocations, like this:

struct Chain{T<:Real}
    N::Int
    x::Vector{T}
    y::Vector{T}
    z::Vector{T}
end
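
A rough way to check the effect without BenchmarkTools is sketched below. The names TypedChain, sum_pair_distance2 and measure_allocations are hypothetical, and the loop body is only a stand-in for the real per-pair work:

```julia
# Parametric struct with concretely typed fields, mirroring the definition above.
struct TypedChain{T<:Real}
    N::Int
    x::Vector{T}
    y::Vector{T}
    z::Vector{T}
end

# Hypothetical stand-in for the pair loop: accumulates squared distances
# from scalars, without building temporary arrays.
function sum_pair_distance2(chain)
    total = 0.0
    for i in 1:chain.N
        for j in (i + 1):chain.N
            total += (chain.x[i] - chain.x[j])^2 +
                     (chain.y[i] - chain.y[j])^2 +
                     (chain.z[i] - chain.z[j])^2
        end
    end
    return total
end

# Measure inside a function so that the untyped global variable adds no noise.
measure_allocations(chain) = @allocated sum_pair_distance2(chain)

chain = TypedChain(200, randn(200), randn(200), randn(200))
measure_allocations(chain)                 # warm-up call, so compilation is excluded
typed_bytes = measure_allocations(chain)   # bytes allocated by the second call
```

With concrete field types, typed_bytes should come out at or near zero. Running the same function on a struct with unannotated fields would be expected to allocate for most of the roughly 20,000 index pairs.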