Recommended way to set up global buffers

Say my package needs some global buffers to avoid allocating many small vectors of a fixed element type but variable length (benchmarks show that this indeed gives a nice speedup). This could look as follows:

const BUFFER = Vector{Vector{Int}}()
const BUFFER_LENGTH = 64

function __init__()
    Threads.resize_nthreads!(BUFFER, Vector{Int}(undef, BUFFER_LENGTH))
end

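function do_some_work(some_length, args...)
    if some_length <= BUFFER_LENGTH
        # small enough: reuse the preallocated buffer of the current thread
        buffer = BUFFER[Threads.threadid()]
    else
        # too large for the global buffer: allocate a fresh vector
        buffer = Vector{Int}(undef, some_length)
    end
    # use buffer etc.
end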

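This worked in the past with static scheduling, where a task stays on the thread it started on, so Threads.threadid() is a stable index into BUFFER. Now there are two more challenges: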

  • Dynamic scheduling of threads
  • Precompilation that executes a workload (e.g. with SnoopPrecompile.jl) does not see buffers set up in __init__.

What is currently the best way to set up some global buffers for a case like this?

While I’m not certain how to deal with precompilation, I think the dynamic scheduling issue can be worked around by using a Channel instead of a Vector of vectors. The difference in this context is that taking an element from a channel removes it from the collection, so two tasks can’t hold the same buffer at the same time. Once a task is done with its buffer, it puts the buffer back into the pool.
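To make the Channel idea concrete, here is a minimal sketch of such a buffer pool. The names make_pool, with_buffer, and sum_of_squares are hypothetical and only for illustration, and sizing the pool to Threads.nthreads() is an assumption rather than a requirement. It does not address the precompilation question.

```julia
# Hypothetical names throughout; this is an illustration, not a package API.
const BUFFER_LENGTH = 64

# Build a pool holding `n` preallocated buffers. A buffered Channel acts as a
# thread-safe queue: `take!` removes a buffer, `put!` hands it back.
function make_pool(n::Integer)
    pool = Channel{Vector{Int}}(n)
    for _ in 1:n
        put!(pool, Vector{Int}(undef, BUFFER_LENGTH))
    end
    return pool
end

# Run `f(buffer)` with a buffer that has room for at least `len` elements.
function with_buffer(f, pool::Channel{Vector{Int}}, len::Integer)
    if len <= BUFFER_LENGTH
        buffer = take!(pool)       # blocks while all buffers are in use
        try
            return f(buffer)
        finally
            put!(pool, buffer)     # hand the buffer back even if `f` throws
        end
    else
        # too large for the pooled buffers: allocate a fresh vector
        return f(Vector{Int}(undef, len))
    end
end

# Example workload: only the first `n` entries of the buffer are used.
function sum_of_squares(pool, n)
    with_buffer(pool, n) do buffer
        for i in 1:n
            buffer[i] = i^2
        end
        return sum(view(buffer, 1:n))
    end
end

pool = make_pool(Threads.nthreads())
tasks = [Threads.@spawn sum_of_squares(pool, n) for n in 1:100]
results = fetch.(tasks)
```

Because a buffer is owned by whichever task took it, it does not matter which thread the task runs on or whether it migrates. The price is the synchronization inside take! and put!, so it is worth benchmarking whether the pool still beats plain allocation for very small buffers.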

Thanks! I may look into that. But first, I will wait to see whether there is a nice way to solve the precompilation problem, too.