Allocations when moving loop into a function

omicron8 · June 1, 2025, 6:19pm

Why does the first variant (function foo) allocate a large amount of memory while the second variant (function bar) does not? The difference is whether the loop is inside or outside the function. Is there a way to have the loop outside without so many allocations?

using BenchmarkTools

function foo!(F, A)
  F[1] = A[2]
  F[2] = - A[1]
  return nothing
end

function bar!(F, A)
  for i = 1:size(A,1)
    F[i,1] = A[i,2]
    F[i,2] = -A[i,1]
  end
  return nothing
end

n = 15_000
A = rand(n,2)
F = similar(A)

@btime begin
  for i = 1:n
    @views foo!((F[i, :]), (A[i, :]))
  end
end

@btime begin
  bar!(F, A)
end

Results:
6.409 ms (161935 allocations: 3.62 MiB)
4.062 μs (0 allocations: 0 bytes)

eldee · June 1, 2025, 7:15pm

Hi, and welcome to the Julia community!

The problem is that in

@btime begin
  for i = 1:n
    @views foo!((F[i, :]), (A[i, :]))
  end
end

n is a (non-const, non-typed) global variable, so the compiler knows neither its type nor that of i, and the types involved have to be checked at run time in every iteration. The same is true for A and F. If you declare all three const (note that this still allows in-place mutation of the Arrays), I get

julia> @btime begin
  for i = 1:n
    @views foo!((F[i, :]), (A[i, :]))
  end
end
  13.000 μs (0 allocations: 0 bytes)
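
For reference, a minimal sketch of what those declarations could look like. It assumes a fresh session, because a name that already holds a non-const value may not be redeclarable as const, depending on the Julia version. The type-annotated global m is a hypothetical extra, not something from the original post, and needs Julia 1.8 or later.

```julia
# Run in a fresh session (see the note above).
const n = 15_000
const A = rand(n, 2)
const F = similar(A)

# const fixes the binding, not the contents, so in-place writes still work.
F[1, 1] = A[1, 2]

# Alternative on Julia 1.8 or later: a type-annotated global.
# The binding can be reassigned, but only to a value of the declared type,
# so the compiler can still rely on that type.
m::Int = 15_000
m = 20_000   # fine, still an Int
```

Either form gives the compiler a fixed type for the global, which is what the benchmark loop at top level is missing.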

Alternatively, BenchmarkTools.jl also allows for interpolating global variables using $:

julia> ... # non-const n, A, F

julia> @btime begin
           for i = 1:$n
               @views foo!(($F[i, :]), ($A[i, :]))
           end
       end
  11.000 μs (0 allocations: 0 bytes)

I’m not sure why we don’t need such interpolation for bar!, though. But even if we did, the types of F and A would only have to be determined once, instead of once in every iteration, so the timing and allocation difference between the interpolated and non-interpolated versions would be much less pronounced.

omicron8 · June 1, 2025, 7:49pm

Thanks a lot!

nsajko · June 1, 2025, 10:04pm

Keep in mind you probably want everything where performance matters to be in a function, because it’s not compiled otherwise. See the Performance tips for more such suggestions:

Performance Tips · The Julia Language

Actually, maybe even take a look at the in-development version of the Performance tips, as they have a better structure:

Performance Tips · The Julia Language
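
Applied to the original question, that advice could look like the sketch below: the loop stays outside foo!, but moves into a function of its own. The wrapper name loop_foo! is made up for illustration and does not appear in the thread. The globals are still non-const here, so they are interpolated with $ as in eldee's example.

```julia
using BenchmarkTools

function foo!(F, A)
  F[1] = A[2]
  F[2] = -A[1]
  return nothing
end

# Hypothetical wrapper: inside a function the types of F, A and i are known
# to the compiler, so the call to foo! no longer has to be resolved at run
# time in every iteration.
function loop_foo!(F, A)
  for i in axes(A, 1)
    @views foo!(F[i, :], A[i, :])
  end
  return nothing
end

n = 15_000
A = rand(n, 2)
F = similar(A)

@btime loop_foo!($F, $A)
```

The types of F and A then only need to be looked up once, at the call to loop_foo!, rather than once per row.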
