How to force a small allocation?

A MWE:

function test()
    Bool[]
    return nothing
end

@time test()

results in

  0.000007 seconds (1 allocation: 64 bytes)

as desired in Julia 1.7.3. But this method doesn’t work in 1.10.9, resulting in no allocation. What’s the universal method?

You instantiate an empty vector and do nothing with it. I guess the 1.10 compiler optimizes the code better and thus doesn’t create the unused vector at all. Seems like a useful performance optimization to me.

But I don’t want it to be too smart, at least for my purpose.

You need to explain your purpose then, otherwise we can’t help you.

6 Likes

There are some general guidelines though, no? For example, if the vector is used in some way, then the vector has to be created:

f() = sum(rand(10))

I’d think this will always allocate.

1 Like

I don’t know the purpose, but this should work (it does for 1.11.5 at least):

function test()
     sizehint!(Bool[], 0)
    return nothing
end

@allocations test() # 1 allocation
1 Like

I want to know the recursion times of my recursive code, when using @btime to benchmark it.

What does this mean if it’s not directly profiling said code, and what does it have to do with forcing a Bool[] allocation to occur?

1 Like

I think you should give even more context. You can of course do something like

julia> function test()
           a = Bool[]
           GC.@preserve a ccall(:memchr, Ptr{Nothing}, (Ptr{Nothing}, Cint, Csize_t), pointer_from_objref(a), 0, 0)
           return nothing
       end

But that involves the full function call overhead of maybe ~15 cycles, and makes the compiler forget a lot of info (i.e. it doesn’t play well with surrounding LICM because the compiler must assume that the foreign call clobbers memory). So does this really reflect your real workload? We cannot help you with that without more context on what your real workloads are.

I mean, an obvious way to make sure the allocation isn’t removed is to actually use it, without doing anything else overcomplicated:

julia> test() = Bool[]
test (generic function with 1 method)

julia> @time test()
  0.000001 seconds (1 allocation: 32 bytes)
Bool[]

But I concur with the request above to explain better what you ultimately want to achieve and avoid XY problems.

1 Like

Thank you guys for helping. This is a more complete example:

function Fib(n)
    Bool[]
    if n==0 || n==1
        return 1
    else
        return Fib(n-1)+Fib(n-2)
    end
end

@time Fib(5)

gives the result

  0.000004 seconds (15 allocations: 960 bytes)
8

in Julia 1.7.3, which is what I want, but it doesn’t work in 1.10.9.

But that example is nonsense, you’re just abusing the global allocation counter for counting the number of invocations. Just do

const counter = Ref(0)
function Fib(n)
    counter[] += 1
    if n==0 || n==1
        return 1
    else
        return Fib(n-1)+Fib(n-2)
    end
end

tic = counter[];
 Fib(5)
toc = counter[]; @show toc - tic
3 Likes

But your method doesn’t work with @btime, right?

You can use the setup keyword to @btime to reset mutable state between runs, if that’s the issue.

If your goal is just to count calls, you could return the count of Fib as a second argument. The count of a particular call would be the sum of the counts of each of the two recursive calls (or 1 in the base case). This could count the calls using only local variables.

3 Likes

This can be optimised out, progress is certainly being made towards that happening if it hasn’t already e.g. https://github.com/JuliaLang/julia/pull/55913

2 Likes

Implementation:

julia> function fib(n)
           if (n == 0) || (n == 1)
               return (1, 1)
           end
           f1, c1 = fib(n - 1)
           f2, c2 = fib(n - 2)
           return (f1 + f2, c1 + c2 + 1)
       end
fib (generic function with 1 method)

julia> fib(5)
(8, 15)
3 Likes

The setup code is executed once per sample, but a sample may contain many evaluations. You have to add evals = 1 to @btime to get a single evaluation per sample. In my experience, this leads to quite inaccurate results for short runtimes.

Thanks a lot. This works well for me, just bringing ~250 ns overhead per call on my machine.

Keep in mind that the language doesn’t provide any guarantees about allocations. They can be optimized away, as you observed, but in other cases, there can be more of them than you expect. Even with @VinceNeede’s solution, this is hardly a reliable way of counting the number of function calls.

Other people have mentioned setup, but the easiest solution here is just to reset the counter as part of the benchmark:

julia> const counter = Ref(0);

julia> function fib(n)
           counter[] += 1
           if (n == 0) || (n == 1)
               return 1
           else
               return fib(n - 1) + fib(n - 2)
           end
       end
fib (generic function with 1 method)

julia> @btime begin
           $counter[] = 0
           f = fib(5)
           (f, $counter[])
       end
  11.261 ns (0 allocations: 0 bytes)
(8, 15)
2 Likes