Improved allocation design, with 4-byte pointers, and sometimes 5-byte in effect

Palli · December 1, 2024, 9:56pm

It’s similar but no, not just arenas. I do not claim full originality; I’m basing my ideas on arena allocators, and on how databases allocate in page-sized chunks. My SIMDString idea is similar to rows in a database sharing a page, and arenas aren’t about fewer pointers, if I understand correctly.

Note, you DO have arenas with Bumper.jl already, and I’m pretty sure with it (or similar techniques) Julia can be as fast as any language, C/C++, Rust and Java. That said it’s slower here, and I’m motivated by that worst-case outlier for Julia:

As I understand arenas from the video, and from all I know about such allocators, yes, they can be fast, but you have to opt into them, in e.g. C and Julia. And as explained in the video you can have many of them, and release everything in each one at once (but you need to do that, or be ok with that memory leak). But that’s based on an arena ID. And then you have an offset into your arena. So you likely have at a minimum 4 + 4 bytes to point to memory within such an arena (possibly only 4 bytes?), though more likely 16 bytes? In my idea the pointer would be 4 bytes, with no need for an arena ID. And if you choose to have an offset into your block as I explained, it could be as little as 1 byte, and still be useful.
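To make the size comparison concrete, here is a minimal Python sketch. It is only one reading of the numbers above, not a specification of the design: the packed layouts, the names `ARENA_HANDLE`, `BLOCK_POINTER`, `BLOCK_PLUS_OFFSET`, `to_address`, and the 256-byte `BLOCK_SIZE` are all assumptions made for illustration (256 is chosen simply because a 1-byte offset covers exactly that many positions).

```python
import struct

# Assumed layouts, for illustration only; the post gives sizes, not formats.
ARENA_HANDLE = struct.Struct("<II")       # 4-byte arena ID + 4-byte offset
BLOCK_POINTER = struct.Struct("<I")       # 4-byte block index, no arena ID
BLOCK_PLUS_OFFSET = struct.Struct("<IB")  # block index + 1-byte offset in block

BLOCK_SIZE = 256  # hypothetical block size; a 1-byte offset spans exactly this


def to_address(block_index, offset=0):
    """Map a (block index, offset) pair to a byte address in one flat heap."""
    if not 0 <= block_index < 2**32:
        raise ValueError("block index must fit in 4 bytes")
    if not 0 <= offset < BLOCK_SIZE:
        raise ValueError("offset must fit in 1 byte")
    return block_index * BLOCK_SIZE + offset


def pack_pointer(block_index, offset):
    """Serialize the 5-byte 'pointer plus offset' form."""
    return BLOCK_PLUS_OFFSET.pack(block_index, offset)


def unpack_pointer(raw):
    """Recover (block index, offset) from the 5-byte form."""
    return BLOCK_PLUS_OFFSET.unpack(raw)


if __name__ == "__main__":
    print("arena ID + offset:", ARENA_HANDLE.size, "bytes")
    print("block index only: ", BLOCK_POINTER.size, "bytes")
    print("index + offset:   ", BLOCK_PLUS_OFFSET.size, "bytes")
    print("highest address:  ", to_address(2**32 - 1, BLOCK_SIZE - 1))
```

Under these assumptions, a 4-byte index into fixed-size blocks reaches well past the 4 GiB a plain 4-byte byte address would cover, and the extra byte is only paid when something needs to point inside a block.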

The rules at the Benchmark Game rule out using arenas for GC languages (at least for some of the code there).

It seems unfair, if you want to show what is the “fastest language”, but I believe the point of the benchmark is to show what speed you get out of the box for GC languages (without tuning the GC allocator). That’s of course very unfair, since what is shown for C, C++ or Rust isn’t what you get by default, only for highly tuned code, opting into arenas. It might be implicit that arenas are much used in such languages (but not GC languages?), and true, that might be the culture, for e.g. game code. I’m just unsure about most code.

The video is interesting, claiming manual memory allocation isn’t that hard, or doesn’t need to be, though it IS strictly speaking a bit harder if you must think of arenas, and opt into them. At least vs GC languages.

I’m thinking about what can be had automatically, sort-of arenas behind the scenes, without the programmer needing to know; some or many scientists might not know of or care about arenas. You also need to think about how composable they are: if you destroy an arena, how does that work? For games they are ok, with e.g. arrays of structs of bittypes, but in general pointers from an arena could point outside of it(?), to other arenas or just to strings.

If your arena points to another arena (would you?) then this gets complicated. If you have even just a huge array of strings (in your arena), they all point to the heap. In C and C++ if you release your arena, you must release all your strings, and it’s an O(n) operation, i.e. slow. A garbage collector CAN actually be better for hard real-time, since it could lazily release off-arena dependencies. To be fair, in robotics or games you would just not release, or do it outside of your hot loop.
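The eager versus lazy trade-off can be sketched with a toy model in Python. This is not a garbage collector and not how any real allocator is implemented; it is a plain deferred-free queue, and `ToyHeap`, `ToyArena`, `release_eagerly`, `release_lazily` and `drain` are hypothetical names made up for the illustration.

```python
from collections import deque


class ToyHeap:
    """Stand-in for a general-purpose heap; counts individual frees."""

    def __init__(self):
        self.live = set()
        self.next_id = 0
        self.frees = 0

    def alloc(self):
        handle = self.next_id
        self.next_id += 1
        self.live.add(handle)
        return handle

    def free(self, handle):
        self.live.remove(handle)
        self.frees += 1


class ToyArena:
    """Arena whose entries refer to off-arena heap objects (e.g. strings)."""

    def __init__(self, heap):
        self.heap = heap
        self.external = []  # handles of off-arena dependencies

    def add_string(self):
        self.external.append(self.heap.alloc())

    def release_eagerly(self):
        """Free every off-arena dependency now: O(n) work at release time."""
        for handle in self.external:
            self.heap.free(handle)
        self.external = []

    def release_lazily(self, pending):
        """Hand the dependencies to a queue: O(1) work at release time."""
        pending.append(self.external)
        self.external = []


def drain(heap, pending, budget):
    """Free at most `budget` deferred handles, e.g. outside the hot loop."""
    freed = 0
    while pending and freed < budget:
        batch = pending[0]
        while batch and freed < budget:
            heap.free(batch.pop())
            freed += 1
        if not batch:
            pending.popleft()
    return freed
```

With the lazy variant the release itself touches no strings; the O(n) cost is still paid, but in bounded slices chosen by whoever calls `drain`, which is the property that matters for a hot loop.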
