Why is the memory blowing up in this multi-threaded code?

So you’re saying that when you look at htop for Julia and for Python/Numba, the memory use is higher for Julia? Or are you comparing htop for Python with @benchmark for Julia? Because that comparison makes no sense.

32GB is probably the total memory allocated over the whole run for both; it’s just a question of how much is in use at any moment.

I’m looking at the RES column in htop for both, and the memory use is higher for Julia. I can’t get equivalent @benchmark stats for Numba.

I suspect it has something to do with objs.data. If I assign dfs = objs.data once at the start and then iterate with for df in dfs, I don’t see the memory jump to multiples of the baseline during the parallel loop.
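A minimal sketch of the hoisting idea described above, assuming `objs.data` is a vector of DataFrame-like objects (the names `objs`, `process_all`, and the `sum(df)` placeholder work are all hypothetical; the actual types were never stated in the thread):

```julia
using Base.Threads

function process_all(objs)
    dfs = objs.data            # hoist the property access out of the loop once
    results = Vector{Any}(undef, length(dfs))
    @threads for i in eachindex(dfs)
        df = dfs[i]            # each task reads only its own element
        results[i] = sum(df)   # placeholder for the real per-frame work
    end
    return results
end
```

The point of the hoist: if the `objs.data` getter allocates or copies on every call, touching it from each iteration of a threaded loop multiplies that allocation across tasks, whereas reading it once up front makes the loop body allocation-free apart from the real work.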

That sounds likely. What sort of data structures are objs and objs.data? That hasn’t really been clear.