Allocations in advent of code

I was fiddling around with the super fun advent of code this year and got surprised by an allocation in the day 5 puzzle.

input = [3,4,3,1,2]
histogram = [count(==(i),input) for i in 0:8]
phase(day) = mod(day-1,9)+1
@allocated for day in 1:256
   histogram[phase(day-2)] += histogram[phase(day)]
end
sum(histogram)

@time says there are 533 allocations (8.328 KiB) in the loop. Why is that? Or have I just benchmarked poorly?

Did you put that code into a function? (see Performance Tips Β· The Julia Language)

If I put this in a function and @time it, I get:

0.000004 seconds (2 allocations: 224 bytes)

and that’s just the two allocations for input and histogram.

3 Likes

I tried to just put the loop in the function, and I was still getting issues. Putting the whole thing in a function gives 2 allocations, as you say.

Did you pass inputs and histograms as arguments to your function, or did you access them as global variables inside that function? This makes a big difference (the latter is much less performant), as also discussed in the page that I linked

3 Likes

For quick stuff like this, you can use let blocks, or wrap everything in functions, like:

let
    #stuff
end

function main()
    #stuff
end
main() # call function

The later is very nice if you use Revise and include the file with includet, such that in the REPL you can continuously just run julia> main() while changing the content of the function.

2 Likes

I passed it in as an argument. There were still lots (hundreds) of allocations when I did that. What’s the right way to isolate this?

Some alternatives:

  1. let block
julia> using BenchmarkTools

julia> @ballocated let 
       input = [3,4,3,1,2]
       histogram = [count(==(i),input) for i in 0:8]
       phase(day) = mod(day-1,9)+1
       for day in 1:256
          histogram[phase(day-2)] += histogram[phase(day)]
       end
       sum(histogram)
       end
224

  1. everthing in a function
julia> function main() 
       input = [3,4,3,1,2]
       histogram = [count(==(i),input) for i in 0:8]
       phase(day) = mod(day-1,9)+1
       for day in 1:256
          histogram[phase(day-2)] += histogram[phase(day)]
       end
       sum(histogram)
       end
       @ballocated main()
224
  1. the data outside the function, but as parameters:
julia> input = [3,4,3,1,2]
       histogram = [count(==(i),input) for i in 0:8]
       phase(day) = mod(day-1,9)+1
       function main(input,histogram)
       for day in 1:256
          histogram[phase(day-2)] += histogram[phase(day)]
       end
       sum(histogram)
       end
       @ballocated main($input,$histogram) # interpolate the input parameters! 
0

always using @ballocated from BenchmarkTools, to avoid computing compilation, etc.

2 Likes

Interpolating the inputs must have been the key! Does that work with @allocated as well, or only @ballocated?

@allocated does not need that, but it has its quirks. First, it will count all stuff associated to the compilation on the first call to the function. Second, it may report allocations associated to the returning of the value from the function to the REPL. Thus, for example:

julia> f(x) = sum(x)
f (generic function with 1 method)

julia> x = rand(10);

julia> @allocated f(x)
4026434

julia> @allocated f(x)
16

On the first call it counted everything associated to compilation. In the second call it still reports 16 allocations, which are associated to the return value to the REPL. These are correctly discounted with @ballocated (properly used, with the variable interpolated):

julia> @ballocated f($x)
0
1 Like

I like to use TimerOutputs to track allocations in cases like this one. Here is how it could be used on your example:

using TimerOutputs

# global TimerOutput object
const to = TimerOutput()

# wrap everything in a function (benchmarking in the global scope is tricky)
function foo(input)
    phase(day) = mod(day-1,9)+1

    # annotate lines or code sections  with @timeit
    @timeit to "histogram" histogram = [count(==(i),input) for i in 0:8]
    @timeit to "for" for day in 1:256
        histogram[phase(day-2)] += histogram[phase(day)]
    end
    @timeit to "sum" sum(histogram)
end

# Make sure everything is compiled
input = [3,4,3,1,2]
foo(input)

# Reset the timer and re-run to get meaningful results
reset_timer!(to)
foo(input)
print_timer(to)

The print_timer invocation above yields an output like

julia> print_timer(to)
 ────────────────────────────────────────────────────────────────────
                             Time                   Allocations      
                     ──────────────────────   ───────────────────────
  Tot / % measured:      1.17ms / 0.27%           41.0KiB / 0.31%    

 Section     ncalls     time   %tot     avg     alloc   %tot      avg
 ────────────────────────────────────────────────────────────────────
 for              1   1.86ΞΌs  59.9%  1.86ΞΌs     0.00B  0.00%    0.00B
 histogram        1   1.01ΞΌs  32.3%  1.01ΞΌs      128B  100%      128B
 sum              1    241ns  7.74%   241ns     0.00B  0.00%    0.00B
 ────────────────────────────────────────────────────────────────────

which helps you see that the only allocations come from instantiating the histogram (which is expected)

3 Likes