Way to return the number of allocations?


#1

Is there a way to return the number of allocations made by a piece of code? For example, I want to get the number 5 from the 5 allocations reported in the following example:

julia> using BenchmarkTools

julia> @btime [1,1,1]+[2,2,2]+[3,3,3];
  193.087 ns (5 allocations: 560 bytes)

The purpose of this is to create a unit test that checks the number of allocations made by a specific function. This is a performance-critical function, so I want to keep the number of allocations minimal, but changes to this function sometimes accidentally increase the number of allocations. I want to detect such cases automatically by writing a unit test like

@test @numallocs(myfunction()) ≤ 2

#2

Try this:

macro numallocs(expr)
    return quote
        n1 = Base.gc_num()
        $(esc(expr))
        n2 = Base.gc_num()
        diff = Base.GC_Diff(n2, n1)
        Base.gc_alloc_count(diff)
    end
end

#3

Hmm, BenchmarkTools computes this like so: https://github.com/JuliaCI/BenchmarkTools.jl/blob/1238af0326a13f44c2121a017a13e7ef327bddf6/src/execution.jl#L306. Not sure if that is more accurate.


#4

@timed should give you that:

z = @timed sin.([x for x in 1:1000]);
z[1] # result
z[2] # elapsed time
z[3] # total number of bytes allocated
...
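
A minimal sketch of how the allocation count (rather than the byte count) might be read from the same result. It assumes that the fifth element returned by @timed is a Base.GC_Diff and that the internal, unexported Base.gc_alloc_count accepts it; both are implementation details that may change between Julia versions. The wrapper function g is a hypothetical name used only for illustration.

```julia
# Hypothetical wrapper so the measured code runs inside a compiled function.
g() = sin.([x for x in 1:1000])

g()               # warm-up call, so compilation is not part of the measurement
z = @timed g()

bytes = z[3]      # total number of bytes allocated
gcdiff = z[5]     # GC statistics for the call (assumed to be a Base.GC_Diff)

# Base.gc_alloc_count is internal to Base, not public API.
nallocs = Base.gc_alloc_count(gcdiff)
println("allocations: ", nallocs, ", bytes: ", bytes)
```

The count obtained this way is not guaranteed to match what @btime prints.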

#5

Thanks, @kristoffer.carlsson and @sbromberger! Both of the suggested methods report more allocations than @btime does. However, starting from the BenchmarkTools code that @kristoffer.carlsson referred to, it was easy to create the macro I wanted by copying the @btime code:

macro nallocs(args...)
    _, params = BenchmarkTools.prunekwargs(args...)
    quote
        tmp = @benchmarkable $(map(esc,args)...)
        warmup(tmp)
        $(BenchmarkTools.hasevals(params) ? :() : :(BenchmarkTools.tune!(tmp)))
        b, val = BenchmarkTools.run_result(tmp)
        bmin = minimum(b)
        a = allocs(bmin)
    end
end

#6

You need to compile the function first by running it once, and then run @timed. @btime does this for you.


#7

I expected so, too, but running @timed [1,1,1]+[2,2,2]+[3,3,3] a second time still reports 9 allocations, whereas @btime reports 5 allocations. To me, 5 allocations makes more sense, because it takes one allocation for each array construction and one for each addition. Provided that 5 allocations is the correct result, is there a way to make @timed report 5 allocations for this code?


#8

@time agrees (perhaps not surprisingly) with @timed. This is a simple function; perhaps someone who’s better at this sort of thing can take a look at @code_llvm or @code_native and see what the correct value is.


#9

If you just want to know the number of allocations, why would you run a benchmark on the function, calling it potentially thousands of times?


#10

Good point. I would love to know how to use one of the methods suggested above to get exactly the same number of allocations as @btime while running the code only twice internally.

Plus, I need a capability to interpolate variables using $ (which is supported by @btime), because the functions I would like to test take arguments and I don’t want to count the number of allocations consumed in constructing those arguments. Not sure which part of the @btime code handles interpolation…

EDIT: OK, using @nallocs slows down the unit test significantly. I definitely need a better solution! Any suggestions?


#11

You can eliminate some extra allocations (“effects of compilation”) by wrapping Kristoffer’s solution (in the manner of @allocated in Base):

macro numalloc(expr)
    return quote
        let
            local f
            function f()
                n1 = Base.gc_num()
                $(esc(expr))
                n2 = Base.gc_num()
                diff = Base.GC_Diff(n2,n1)
                Base.gc_alloc_count(diff)
            end
            f()
        end
    end
end

Handling interpolations looks to be nontrivial.
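
One possible way to sidestep interpolation is to use a plain function instead of a macro: arguments are built by the caller, so their construction falls outside the measured region. The sketch below is only an illustration under that assumption. count_allocs is a hypothetical name, and Base.gc_num, Base.GC_Diff and Base.gc_alloc_count are internal to Base, so they may change between Julia versions.

```julia
# Hypothetical helper: counts the allocations made by a single call f(args...).
# The type parameters F and N ask Julia to specialize on f and on the arguments.
function count_allocs(f::F, args::Vararg{Any,N}) where {F,N}
    f(args...)                 # warm-up call, so compilation is not counted
    n1 = Base.gc_num()         # GC counters before the measured call
    f(args...)                 # measured call
    n2 = Base.gc_num()         # GC counters after the measured call
    return Base.gc_alloc_count(Base.GC_Diff(n2, n1))
end

# The arguments are constructed here, outside the measured region.
a, b = [1, 1, 1], [2, 2, 2]
n = count_allocs(+, a, b)
println("allocations in a + b: ", n)
```

In a unit test this could be used as @test count_allocs(myfunction) ≤ 2. The function runs f only twice, but note that a mutating f is also executed twice, and the reported count is not guaranteed to match @btime. On Julia 1.9 or later, Base also provides an @allocations macro that returns the number of allocations made while evaluating an expression.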