Reusing temporary variables (ed: Object Pools)

cortner · May 24, 2021, 3:44pm

In most of my production codes I am very cautious about making performance-critical parts non-allocating and passing in temporary variables (e.g. large arrays, NamedTuples of arrays, etc). This results in code of the form

myf(model, x) = myf!(alloc_tmp(model, x), model, x) 
function myf!(tmp, model::MyModel, x)
   # implement the model
end
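
A minimal sketch of this pattern, assuming a hypothetical `MyModel` that just wraps a matrix (the fields and the body of `myf!` are illustrative assumptions, not the actual model):

```julia
# Hypothetical model type, for illustration only.
struct MyModel
    A::Matrix{Float64}
end

# Allocate the scratch space once; a NamedTuple keeps the buffers named.
alloc_tmp(model::MyModel, x::AbstractVector) =
    (y = Vector{Float64}(undef, size(model.A, 1)),)

# Non-allocating kernel: writes A * x into tmp.y and returns its squared norm.
function myf!(tmp, model::MyModel, x::AbstractVector)
    A, y = model.A, tmp.y
    fill!(y, 0.0)
    for j in axes(A, 2), i in axes(A, 1)
        y[i] += A[i, j] * x[j]
    end
    return sum(abs2, y)
end

# Allocating convenience wrapper around the in-place kernel.
myf(model, x) = myf!(alloc_tmp(model, x), model, x)
```

A caller that evaluates the model many times would call `alloc_tmp` once and then pass the same `tmp` to every `myf!` call.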

This is mostly fine, but it results in quite a bit of manual extra code management, and as the complexity of the codebase grows I am running into more and more difficult edge cases.

Are there any code patterns or packages that help manage and simplify this kind of strategy? E.g. I’m thinking of some kind of “heap” of allocated temporaries where a function can retrieve them when called again.

Apologies if the question is a bit vague - I’m not 100% certain what I’m looking for.

Jeff_Emanuel · May 24, 2021, 4:15pm

You are probably looking for Object pool pattern - Wikipedia

cortner · May 24, 2021, 4:21pm

yes - that sounds exactly right. Has anybody experimented with this in Julia?

Can I think of it a little bit as hacking the garbage collector?

cortner · May 24, 2021, 4:22pm

I see

but it says “experimental”. Any experience with this or similar packages?

Jeff_Emanuel · May 24, 2021, 4:25pm

It’s not hacking the garbage collector, but circumventing it with ostensibly more efficient management. I’d use an object pool only as a last resort. The advantage over your current practice is that it isolates all the management to one point of contact.

cortner · May 24, 2021, 4:31pm

Thanks for your comments. You say “as a last resort”, but also that this has an advantage over my current practise, though I didn’t understand your point about “isolating all the management to one point of contact”.

“Morally” a pool and what I do seem very similar/same, but it feels like I’m manually implementing something like an object pool again and again and again.

Jeff_Emanuel · May 24, 2021, 4:34pm

I say that partly because of the corner-case problems you are encountering or are yet to encounter, and because it may not outperform GC.

Yes, the single point of contact is that the code is implemented once and reusable, and you can encapsulate all the corner case handling, such as thread safety.
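
To make the "single point of contact" concrete, here is a minimal sketch of a lock-protected pool of vectors. The names `ArrayPool`, `acquire!`, `release!` and `with_buffer` are hypothetical and do not come from any package mentioned in this thread:

```julia
# Hypothetical minimal pool of reusable vectors, guarded by a lock.
struct ArrayPool{T}
    free::Vector{Vector{T}}
    lock::ReentrantLock
end
ArrayPool{T}() where {T} = ArrayPool{T}(Vector{T}[], ReentrantLock())

# Take a buffer from the pool (or create an empty one) and resize it to n.
function acquire!(pool::ArrayPool{T}, n::Integer) where {T}
    buf = lock(pool.lock) do
        isempty(pool.free) ? Vector{T}(undef, 0) : pop!(pool.free)
    end
    return resize!(buf, n)   # contents are unspecified after resizing
end

# Hand a buffer back so that a later acquire! can reuse it.
function release!(pool::ArrayPool{T}, buf::Vector{T}) where {T}
    lock(pool.lock) do
        push!(pool.free, buf)
    end
    return nothing
end

# Run f on a pooled buffer and always return the buffer afterwards.
function with_buffer(f, pool::ArrayPool, n::Integer)
    buf = acquire!(pool, n)
    try
        return f(buf)
    finally
        release!(pool, buf)
    end
end
```

Usage would look like `with_buffer(pool, 3) do buf ... end`. The corner cases live in one place: here the lock handles concurrent access, while the caller is still responsible for not keeping a reference to `buf` after the block ends.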

cortner · May 24, 2021, 4:43pm

Thanks - I appreciate your thoughts - that really helps.

I’m still hoping somebody will tell me about practical experience with a Julia package.

cortner · May 24, 2021, 7:45pm

@Tamas_Papp can you comment on why your package isn’t registered?

ettersi · May 25, 2021, 12:52am

Can you share some of these difficult edge cases? For the most part, it seems that all you have to do is write a non-allocating function foo!() and then add an allocating version foo() which is usually a one-liner.

cortner · May 25, 2021, 1:16am

Almost always related to AD - when the input is e.g. a vector of duals, I have to come up with the derived types needed for temporary arrays. I have no MWE I’m afraid. So as we are starting to switch to ChainRules this may well never be a problem anymore.
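
One common way to derive the element type of a temporary is to promote over the element types of the inputs. The sketch below uses complex numbers as a stand-in for dual numbers; it assumes the AD number type takes part in `promote_type` in the same way, and the helper names `alloc_scaled_tmp` and `scale!` are hypothetical:

```julia
# Hypothetical helper: pick the buffer element type from the inputs
# instead of hard-coding Float64.
function alloc_scaled_tmp(params::AbstractVector, x::AbstractVector)
    T = promote_type(eltype(params), eltype(x))
    return (y = similar(x, T),)
end

# In-place kernel: elementwise product written into the buffer.
function scale!(tmp, params, x)
    @. tmp.y = params * x
    return tmp.y
end

params = [1.0, 2.0]
x = [1 + 2im, 3 + 0im]            # Complex{Int} stands in for a dual type
tmp = alloc_scaled_tmp(params, x) # buffer eltype is promoted to ComplexF64
scale!(tmp, params, x)
```

This covers the simple case only; when the temporaries hold intermediate results whose type is not a plain promotion of the inputs, the element type has to be worked out from the computation itself.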

Sometimes things also get weird when the temporary variables depend too much on the input.

Note it is not quite as easy as having an allocating and a non-allocating version. The issue is that the allocation code must normally be called by an outer routine that makes many calls to myf!. Hence the “interface” must be reasonably generic. A pool would completely solve that problem and get rid of the need for a generic interface for alloc_tmp.
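
As a rough sketch of how a pool-like cache removes the need to thread temporaries through the outer routine, the function below looks up its own scratch vector in task-local storage. The names `scratch` and `sumsq_shifted` and the cache key `:scratch_cache` are assumptions made for this illustration:

```julia
# Hypothetical cache: one scratch vector per (element type, length), per task.
function scratch(::Type{T}, n::Integer) where {T}
    cache = get!(() -> Dict{Any,Any}(), task_local_storage(), :scratch_cache)
    return get!(() -> Vector{T}(undef, n), cache, (T, Int(n)))::Vector{T}
end

# The kernel fetches its own temporary; callers never see it.
function sumsq_shifted(x::AbstractVector{T}, shift) where {T}
    tmp = scratch(T, length(x))   # reused on later calls from the same task
    @. tmp = x - shift
    return sum(abs2, tmp)
end
```

The lookup is not free, and every caller that asks for the same type and length within a task receives the same array, so nested uses can overwrite each other. That aliasing is one of the corner cases a real pool would have to handle.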

ettersi · May 25, 2021, 1:52am

I remember reading somewhere on this forum that the garbage collector already does some form of memory pooling, i.e. if you allocate and free a large number of equally sized arrays, then the garbage collector is smart enough to just reuse the same chunk of memory.

We can test this hypothesis in two ways. First, we can simply allocate and free a large number of equally sized arrays and take a look at the memory addresses:

julia> a = [pointer(Vector{Int}(undef, 100_000)) for i = 1:10_000]
       length(unique(a))
532  # <- Much less than 10_000!

julia> a = [pointer(Vector{Int}(undef, 1_000_000)) for i = 1:10_000]
       length(unique(a))
56  # <- Even fewer if we increase the array size.

Second, we can check whether allocating and freeing equally sized arrays is actually faster than allocating and freeing varyingly sized arrays:

julia> using BenchmarkTools

# Constant size
julia> @btime for i = 1:10_000; Vector{Float64}(undef, 55_000); end
  7.807 ms (20000 allocations: 4.10 GiB)

# Varying size
julia> @btime for i = 1:10_000; Vector{Float64}(undef, rand(10_000:100_000)); end
  13.001 ms (20000 allocations: 4.05 GiB)

Both of these experiments indicate that the vanilla garbage collector indeed does some form of memory pooling. It may of course still be possible to improve on this by writing your own memory manager which exploits domain-specific information, but doing so is quite a bit of work, incurs a high risk of introducing subtle bugs, and probably requires a lot of hand-tuning to really be worthwhile. So all in all, I’d say your options are 1) write your code to be allocating and trust that the garbage collector will handle temporary memory for you, or 2) write your code to be non-allocating and eliminate all the guesswork regarding whether or not memory management will be a performance bottleneck in your application.
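
For option 2, one way to take out the guesswork is to measure allocations directly with `@allocated` from Base. A small sketch, with hypothetical function names, that compares an in-place kernel against its allocating counterpart:

```julia
# In-place kernel: writes into a caller-provided buffer.
function fill_squares!(out, x)
    @. out = x^2
    return out
end

# Measure inside functions so that global-scope overhead is not counted.
bytes_inplace(out, x) = @allocated fill_squares!(out, x)
bytes_allocating(x) = @allocated x .^ 2

x = rand(1000)
out = similar(x)

# Call once to compile, then measure.
bytes_inplace(out, x); bytes_allocating(x)
println("in-place:   ", bytes_inplace(out, x), " bytes")
println("allocating: ", bytes_allocating(x), " bytes")
```

One would expect the in-place number to be zero or close to it, while the allocating version has to pay for a fresh result array on every call. The exact figures depend on the Julia version.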

cortner · May 25, 2021, 3:39am

So, all things considered, would you revise your thoughts on Zulip and conclude that it may just not be worth worrying about allocations?

ettersi · May 25, 2021, 4:02am

I guess it’s a trade-off between having ultimate control of what’s going on in your code and ease of coding. I don’t have much experience in how these two aspects play out in your application. In linear algebra, writing non-allocating code is mostly a question of quickly checking what temporaries you actually need and then following some simple design patterns to separate all the allocations. For the purpose of automatic differentiation, I could imagine things get more complicated, but maybe we need some concrete examples to get to the bottom of this.
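
In the linear algebra case, the design pattern could look like the following sketch, which relies on the in-place `mul!` from the LinearAlgebra standard library. The function names `residual_norm!` and `residual_norm` are hypothetical:

```julia
using LinearAlgebra

# Non-allocating version: r is the only temporary and is supplied by the caller.
function residual_norm!(r, A, x, b)
    mul!(r, A, x)   # r = A * x, written in place
    r .-= b         # r = A * x - b, still in place
    return norm(r)
end

# Allocating version: all allocation is isolated in this one line.
function residual_norm(A, x, b)
    T = promote_type(eltype(A), eltype(x), eltype(b))
    return residual_norm!(similar(b, T), A, x, b)
end
```

An iterative solver would allocate `r` once before its loop and call `residual_norm!` in every iteration.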

Tamas_Papp · May 25, 2021, 6:14am

In the end I rewrote the code so that I didn’t end up using this approach. I still think it is viable but I expect corner cases will turn up with use that would need some careful handling. If someone wants to follow up and contribute to this approach, I am happy to register it, but can’t provide support at the moment.

cortner · May 25, 2021, 12:57pm

Thank you for the explanation

halleysfifthinc · May 25, 2021, 1:18pm

I explored a similar concept in this thread and you can find the code I used at GitHub - halleysfifthinc/SafeBuffers.jl: Concurrency/multi-threading safe pre-allocated mutable buffers (e.g. arrays, etc.)

It is usable as a package, but it is not registered, nor do I have any plans at the moment to register or maintain it. That said, it solved my problem. After a quick perusal of Tamas’ ObjectPools.jl, two main differences are that my interface is thread safe and that the type of the pool isn’t restricted (e.g. Refs would be valid, etc).
