Hi, I have some code that can apply a performance optimization provided it can allocate an N×N matrix. If N is too large, I get an out-of-memory error. In that case I can use an alternative algorithm that doesn’t need this allocation but is slower.
Is there a way to programmatically check if such an allocation is possible? So that my algorithm doesn’t throw an error to the user but automatically switches to the alternative version.
Also, is what I want to do in general safe/advised?
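For concreteness, here is a rough sketch of the pattern I have in mind (`fast_solve` / `slow_solve` are just placeholders for my two algorithms, I'm assuming Float64 entries for the example, and the `Sys.free_memory()` comparison is only the naive check I could come up with, since free memory can obviously change at any moment):

```julia
# Placeholders for the two real algorithms:
fast_solve(N) = sum(zeros(N, N))   # needs the full N×N allocation
slow_solve(N) = 0.0                # slower, but allocation-free

function solve(N)
    need = N * N * sizeof(Float64)   # bytes the fast path would need
    # This is the kind of check I mean, but it races with other processes.
    return need < Sys.free_memory() ? fast_solve(N) : slow_solve(N)
end
```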
There’s not really a reliable, cross-platform way to do this, because of Linux:
- Linux malloc doesn’t guarantee that you can actually use the allocated memory without triggering the OOM killer.
- Even if you malloc and touch every page of memory to ensure it’s currently available, the OOM killer could later reap your process in a low-memory situation caused by your process, another process, or the kernel allocating memory.
Still, it’d be a fine strategy to allocate the memory and then write to the first byte of every 4 KB page of the allocation to force the OS to materialize the whole thing (a plain read isn’t enough, since untouched anonymous pages are just backed by the shared zero page). If this doesn’t trigger the OOM killer, then in the limited case where neither the kernel nor any user application allocates more memory afterwards, you’re “safe” to use that allocation.
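Something along these lines (an untested sketch; `try_commit` is a made-up helper name, 4096 is an assumed page size, and reinterpreting the buffer into your actual N×N matrix is left out):

```julia
# Allocate the buffer up front and write one byte per (assumed) 4 KiB page so
# the kernel commits the memory now instead of lazily inside the fast algorithm.
function try_commit(nbytes::Integer)
    buf = try
        Vector{UInt8}(undef, nbytes)
    catch err
        err isa OutOfMemoryError || rethrow()
        return nothing                 # even the allocation itself failed
    end
    pagesize = 4096                    # assumption; the real page size may differ
    for i in 1:pagesize:nbytes
        @inbounds buf[i] = 0x1         # a write forces the page to be materialized
    end
    return buf                         # if we got here without being OOM-killed, use it
end
```

If this returns `nothing`, fall back to the slow algorithm; if the process gets reaped while touching the pages, there’s nothing you can do from inside it anyway.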
@jpsamaroo Good reply. I have delved into the OOM killer quite a lot in the past.
Possibly a stupid question from me - what if you disable memory overcommit on Linux? Does that give you a more predictable OOM situation?
PS: we should also consider running the code in a cgroup here - surely you know how much memory you can allocate if the cgroup is given a fixed amount of memory?
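For reference, the knob I mean is `/proc/sys/vm/overcommit_memory`; an untested sketch (the `overcommit_mode` name is made up) for detecting which policy is active:

```julia
# Read the current Linux overcommit policy: 0 = heuristic, 1 = always, 2 = never.
function overcommit_mode()
    path = "/proc/sys/vm/overcommit_memory"
    isfile(path) || return nothing       # not Linux, or /proc not mounted
    return parse(Int, strip(read(path, String)))
end
```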
I suspect disabling memory overcommit would break other applications that depend on it, but it’s worth a try if it’s critical to avoiding OOMs for this specific application.
Note: I’m not an expert with cgroups, so this is based on how I assume cgroups work:
I don’t think a cgroup would save you here; a memory cgroup only says “invoke the OOM killer if the applications in the container exceed N bytes of memory allocated”, but it doesn’t then protect those applications from being reaped if an application outside of the cgroup triggers the OOM killer, or if the kernel somehow allocates a lot of memory.
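With that caveat, the concrete numbers a memory cgroup does give you are its limit and current usage. Assuming cgroup v2 mounted at `/sys/fs/cgroup` (again, untested, and `cgroup_headroom` is a made-up name), estimating the headroom would look roughly like:

```julia
# Estimate how many bytes are left before this process's cgroup hits its limit.
# Assumes cgroup v2; returns nothing if there is no limit or the files are missing.
function cgroup_headroom()
    # /proc/self/cgroup looks like "0::/user.slice/..."; take the path after "::".
    cgpath  = lstrip(String(last(split(readline("/proc/self/cgroup"), "::"))), '/')
    maxfile = joinpath("/sys/fs/cgroup", cgpath, "memory.max")
    curfile = joinpath("/sys/fs/cgroup", cgpath, "memory.current")
    (isfile(maxfile) && isfile(curfile)) || return nothing
    limit = strip(read(maxfile, String))
    limit == "max" && return nothing     # no limit configured
    return parse(Int, limit) - parse(Int, strip(read(curfile, String)))
end
```

But as above, that only tells you about the cgroup’s own limit; it doesn’t protect you from the system-wide OOM killer.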
This is exactly what try/catch was invented for, so I don’t think you should shy away from it. What I think you want to avoid is throwing tons of errors as part of your basic flow-control design, i.e. using exceptions where a plain if statement would do.
try/catch is designed for trying something and falling back to some other code path if the first breaks horribly/irrecoverably (e.g. your memory limitation, where you can’t continue on the first path at all if it fails). If you only ever hit the try path, you’ll barely notice it at all. Hitting the catch path does mean unwinding the stack, though, which is expensive, so it shouldn’t happen often or in a hot code path.
It’s generally a question of what you consider “exceptional”, but you’ll want to arrange things so that in the vast majority of cases the try path succeeds.
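Concretely, an untested sketch of that pattern (`solve_with_buffer!` / `solve_without_buffer` are placeholders for the two algorithms from the original question):

```julia
# Placeholders for the two algorithms from the original question:
solve_with_buffer!(A)   = sum(fill!(A, 1.0))   # fast path, uses the N×N buffer
solve_without_buffer(N) = Float64(N)           # slow, allocation-free fallback

function solve(N)
    A = try
        Matrix{Float64}(undef, N, N)           # may throw OutOfMemoryError
    catch err
        err isa OutOfMemoryError || rethrow()  # only swallow the OOM case
        nothing
    end
    # Note the Linux caveat above: the allocation can "succeed" and the process
    # may still be killed later when the memory is actually touched.
    return A === nothing ? solve_without_buffer(N) : solve_with_buffer!(A)
end
```

That way the error never reaches the user, and the catch branch (with its stack unwind) only runs in the rare case where the allocation genuinely fails.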