Preallocated variable length buffer with known maximum length

In this code snippet:

n = 9
buf = Int[]
sizehint!(buf, n)

function foo(buf)
  empty!(buf)

  for i in 1:rand(1:n)
    push!(buf, rand(1:n))
  end
end

I intended to preallocate buf, so it foo(buf) can be called without further allocations. I failed because:

julia> @btime foo($buf)
  367.466 ns (10 allocations: 350 bytes)

Is there a solution to this problem?

EDIT

This is a closer version of foo to what I’m working with:

function foo(buf)
  empty!(buf)

  while length(buf) < rand(1:n)
    push!(buf, rand(1:n))
  end
end

The length of buf is unpredictable, but not exceeding n.

I think you can use resize! and then normal setindex syntax instead of push, which has more complex logic (because you deliberately set the hinted size, which push doesn’t know and has to check every time)

It doesn’t help:

function foo1(buf)
  j = rand(1:n)
  resize!(buf, j)

  for i in 1:j
    buf[i] = rand(1:n)
  end
end

still allocates:

julia> @btime foo1($buf)
  370.968 ns (10 allocations: 339 bytes)

Additionally, earlier version of foo was a bit too simplistic. I don’t really know the new buffer size at the beginning as it suggested. This is closer to the reality of my algorithm:

function foo(buf)
  empty!(buf)

  while length(buf) < rand(1:n)
    push!(buf, rand(1:n))
  end
end

Don’t use n as global variable

3 Likes

Thank you:

julia> @btime foo($buf)
  75.193 ns (0 allocations: 0 bytes)

Now I’m scratching my head trying to figure out why… Is there any other reason for compiler to call allocate than pushing to array, which already used all preallocated memory?

Another option to avoid allocation is to parametrize foo with n:

function foo(buf, n)
  empty!(buf)

  while length(buf) < rand(1:n)
    push!(buf, rand(1:n))
  end
end

Is the compiler trying to be too smart for its own good?

No :slight_smile:

1 Like

Why would compiler care what is inside getnum()? Can it just check if allocation is needed inside push call?

n = 9
buf = Int64[]
sizehint!(buf, n)

getnum() = rand(1:n)

function bar(buf)
  empty!(buf)

  while length(buf) < getnum()
    push!(buf, getnum())
  end
end
julia> @btime bar($buf)
  268.148 ns (7 allocations: 234 bytes)

I think the problem is that if n is a global and not a const, the getnum function is not guaranteed to return an int because it needs to be prepared for n to be mutated to anything else. That’s why getnum is not type stable and to deal with that some helper data structures that cause allocations are needed

2 Likes

Because you run that function so the compiler has to care?

1 Like

Changing it to:

const n = 9

still causes allocations.

My point is that foo and bar can be executed without needing allocations. Why is that for some external conditions compiler produces allocations and for others it doesn’t? The control and data flow are identical with and without allocations.

Also:

n = 9

function baz()
  println(typeof(rand(1:n)))
end

julia> baz()
Int64

so the type seems to be stable. I also tried:

getnum()::Int64 = rand(1:n)

and allocations are still there.

I meant, why compiler analysis of getnum() determines whether allocation will be performed? I see that unknown return type of getnum as mentioned by @jules could necessitate that, but it is not the case, is it?

Because if things are global then certain optimizations are not available, values might have to be boxed etc.

Not for me:

julia> @time bar(buf) # compilation allocations
  0.058862 seconds (65.22 k allocations: 3.386 MiB, 9.11% gc time)

julia> @time bar(buf)
  0.000004 seconds

What you did there with the typeof does nothing for checking type stability. Use @code_warntype. Also, a type assert doesn’t help with the fact that n is a global that you use.

It really seems you should read https://docs.julialang.org/en/v1.5/manual/performance-tips/ carefully. Counting the exact number of allocations in type unstable code is pretty pointless, it’s mostly counting implementaion details.

3 Likes

Thanks, now it makes sense:

julia> @code_warntype foo(buf)
Variables
  #self#::Core.Compiler.Const(foo, false)
  buf::Array{Int64,1}

Body::Nothing
1 ─      Main.empty!(buf)
2 ┄ %2 = Main.length(buf)::Int64
│   %3 = (1:Main.n)::Any
│   %4 = Main.rand(%3)::Any
│   %5 = (%2 < %4)::Any
└──      goto #4 if not %5
3 ─ %7 = (1:Main.n)::Any
│   %8 = Main.rand(%7)::Any
│        Main.push!(buf, %8)
└──      goto #2
4 ─      return

Globals can be so evil!

2 Likes