Passing around large structs efficiently

I have a struct

struct Chain
    N
    x
    y
    z
end

where x, y, and z are vectors of length N.

I have a function to process a Chain instance

function foo1(chain)
    for i in 1:chain.N
        for j in (i+1):chain.N
            do_something_with_chain
        end
    end
end

When I benchmark it using BenchmarkTools with @benchmark, I see a huge number of allocations (on the order of 200,000 for N=200).

> chain = Chain(200, randn(200), randn(200), randn(200))
> @benchmark foo1($chain)

However, if I construct chain inside the body of the function, as in

function foo2()
    chain = Chain(200, randn(200), randn(200), randn(200))
    for i in 1:chain.N
        for j in (i+1):chain.N
            do_something_with_chain
        end
    end
end

then foo2 has far fewer allocations (about 200) and runs much faster when I benchmark it.

> @benchmark foo2()

My questions are:

  1. Is this allocation issue connected to BenchmarkTools or passing around big structs?
  2. If it's a problem with passing around big structs, how should I pass them efficiently?

Thanks!

Can you give a real example of do_something_with_chain? As written, it almost seems like the code could be optimized down to see that chain.N is the only field accessed, but I imagine that’s not true in practice.

Does the difference go away if you add type annotations to the fields of your struct? (You should do this anyway to get better performance.)

Sure, I replace do_something_with_chain with the following code

p1 = [chain.x[i], chain.y[i], chain.z[i]]
p2 = [chain.x[j], chain.y[j], chain.z[j]]
if is_colliding(p1, p2, radius)
    return false
end

And the related functions are

function is_colliding(p1, p2, radius)
    d2 = distance2(p1, p2)
    return d2 < 4 * radius^2
end

distance2(p) = sum(abs2, p)
distance2(p1, p2) = distance2(p1.-p2)

However, even if do_something_with_chain does nothing, there are still 20,000 allocations.
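
Separately from how the struct is declared, the array literals for p1 and p2 typically create a new Vector for every pair (i, j). A minimal sketch of the same check using tuples, which normally do not need a heap allocation, might look like this. The function name no_collisions and the choice to pass x, y, z directly are assumptions made only for this illustration; distance2 and is_colliding follow the definitions above.

```julia
distance2(p) = sum(abs2, p)
distance2(p1, p2) = distance2(p1 .- p2)   # broadcasting over tuples yields a tuple
is_colliding(p1, p2, radius) = distance2(p1, p2) < 4 * radius^2

# Hypothetical helper: returns true when no pair of points is closer than 2*radius.
function no_collisions(x, y, z, radius)
    n = length(x)
    for i in 1:n
        p1 = (x[i], y[i], z[i])           # tuple instead of a Vector
        for j in (i+1):n
            p2 = (x[j], y[j], z[j])
            is_colliding(p1, p2, radius) && return false
        end
    end
    return true
end

# Three points on a line, spaced 3 apart.
x = [0.0, 3.0, 6.0]; y = zeros(3); z = zeros(3)
println(no_collisions(x, y, z, 1.0))      # collision threshold is a distance of 2
println(no_collisions(x, y, z, 2.0))      # collision threshold is a distance of 4
```

Only the brackets change to parentheses; distance2 works unchanged because sum and broadcasting both accept tuples.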

I still don’t fully understand exactly what you’ve benchmarked. Can you provide a single snippet of code that can be directly copied and pasted? That would make it easy to understand if @sostock’s proposal solves everything.

This seems to solve my problem. Thanks!

As suggested by @sostock, I should annotate the fields of my struct to avoid unnecessary allocations, like this:

struct Chain{T<:Real}
    N::Int
    x::Vector{T}
    y::Vector{T}
    z::Vector{T}
end
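
To see why the annotations matter, here is a rough sketch comparing an untyped and a typed version of the struct. The names UntypedChain, TypedChain and pair_sum are made up for this comparison, and pair_sum is only a stand-in for the real loop body. Fields without annotations are stored as Any, so the compiler cannot infer the types of chain.N or chain.x[i] inside the function and has to box intermediate values.

```julia
# Fields without annotations are implicitly ::Any.
struct UntypedChain
    N
    x
    y
    z
end

struct TypedChain{T<:Real}
    N::Int
    x::Vector{T}
    y::Vector{T}
    z::Vector{T}
end

# Stand-in workload: sum of squared distances over all pairs, no temporary arrays.
function pair_sum(chain)
    total = 0.0
    for i in 1:chain.N
        for j in (i+1):chain.N
            total += (chain.x[i] - chain.x[j])^2 +
                     (chain.y[i] - chain.y[j])^2 +
                     (chain.z[i] - chain.z[j])^2
        end
    end
    return total
end

x, y, z = randn(200), randn(200), randn(200)
untyped = UntypedChain(200, x, y, z)
typed = TypedChain(200, x, y, z)

# Call each method once first so compilation is not counted below.
pair_sum(untyped); pair_sum(typed)

println(fieldtype(UntypedChain, :x))          # declared field type of the untyped struct
println(fieldtype(TypedChain{Float64}, :x))   # declared field type of the typed struct
println(@allocated pair_sum(untyped))         # bytes allocated with untyped fields
println(@allocated pair_sum(typed))           # bytes allocated with typed fields
```

The loop is identical in both cases; only the field declarations differ, which suggests the allocations come from the untyped fields rather than from passing the struct as an argument.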