OOM Killer on Linux

I’m curious if anyone else has run into this.

I’m on Ubuntu and trying to run an analysis on some really large datasets (40 GB binary, 1.5 billion rows, 1600 chunks) using JuliaDB. I’ve frequently run into the issue where I can’t utilize all 8 threads on my CPU because I run out of memory first: Linux kills off the processes when that happens, and then the whole execution fails. The only solution I have is to limit myself to only 2 or 3 threads, maybe 4 if I feel lucky, or to chunk the data even smaller.

Is there a way to throttle the processes as I run out of memory, so that the script finishes, rather than having a process killed 70% of the way through a 30-minute run?

I think not. You may try increasing the amount of swap available.

`pmap` is fault tolerant, at least when working with worker processes added via `addprocs` (not sure about when using threads). See the thread “Fault tolerant `pmap` when worker goes down”.
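As a rough illustration of the retry side of `pmap`, here is a minimal sketch. `process_chunk` and `run_chunks` are hypothetical names standing in for the real per-chunk analysis; `retry_delays` and `ExponentialBackOff` are part of the standard `pmap` API. Whether a retry actually recovers from a worker that the OOM killer took down is something to verify on your own setup and Julia version.

```julia
using Distributed

# In a real run you would first start workers, e.g. addprocs(4), and define
# process_chunk with @everywhere. Without workers, pmap runs on the main process.

# Hypothetical per-chunk analysis; stands in for the real JuliaDB work.
process_chunk(chunk_id) = chunk_id^2

# Retry a failed chunk up to three times, waiting longer before each attempt,
# instead of letting the first error abort the whole map.
run_chunks(f, chunk_ids) =
    pmap(f, chunk_ids; retry_delays = ExponentialBackOff(n = 3))

results = run_chunks(process_chunk, 1:8)
```

If a chunk still fails after the last retry, the error propagates as usual, so it may be worth saving per-chunk results to disk as they complete.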

Ooooh… as @zgornel points out, I don’t think there is a throttling mechanism.

I would start to witter on about containers, which depend on cgroups.
But I don’t think this will help: https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt

Linux-specific warning. However, at one stage when working with large simulations I implemented ZRAM. This works in a kind of counterintuitive fashion: you allocate some RAM as swap space, BUT that swap is compressed. So if your data is compressible you effectively get more RAM.

Give me a message and I can try to help you set it up.

There is the freezer with cgroups.

But that is not a gentle throttle. I don’t see much point in putting a high-memory process in the freezer. When you thaw it out it will only continue to grow.

The real issue here is JuliaDB’s large memory footprint, IMO. It should be possible to process arbitrarily large datasets with JuliaDB with a fixed RAM budget (at the cost of lost performance, but there’s no avoiding that).

Trying to throttle a process in the hopes that it will cause it (and its friends) to allocate less seems like a poor solution to the problem; ZRAM or swap is probably the better option in the short term.
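Until then, the workaround from the original post (running fewer workers) can at least be sized from the machine’s free memory rather than by guesswork. A minimal sketch, assuming you have measured the peak memory of one worker yourself: `BYTES_PER_WORKER` and `workers_that_fit` are hypothetical names, and the 6 GiB figure is only a placeholder. `Sys.free_memory()` and `Sys.CPU_THREADS` are standard Julia.

```julia
# Hypothetical estimate of one worker's peak memory, in bytes; measure it for your job.
const BYTES_PER_WORKER = 6 * 2^30

# Number of workers that fit into `available` bytes while keeping a fraction
# `reserve` of it as headroom. Always returns at least 1 and at most `max_workers`.
function workers_that_fit(available, per_worker;
                          reserve = 0.2, max_workers = Sys.CPU_THREADS)
    usable = available * (1 - reserve)
    n = floor(Int, usable / per_worker)
    return clamp(n, 1, max_workers)
end

n = workers_that_fit(Sys.free_memory(), BYTES_PER_WORKER)
println("Starting with $n worker(s)")
```

This only picks a starting worker count; it does not stop a single worker from growing past its estimate, so swap or ZRAM is still useful as a safety net.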
