Multithreaded memory usage

I am fairly sure the OOM problem you observe can be avoided by changing the _reduce function to not spawn tasks recursively, but instead chunk the work first and then spawn all tasks at once (like I did in my previous example). Earlier you said you were willing to try this. Did it help?

I tested this again with the latest MWE you provided. Spawning recursively puts me at around ~30 GB when I run with 6 or 12 threads. Using the map approach from my other example above I only use ~11 GB, which agrees with what the debug message predicts. From the bit of testing I did, I also conclude that the non-recursive version finishes much faster, since it spawns all tasks at once and they can immediately start their work, whereas in the recursive version task spawning is delayed quite a lot (tested by inserting a println call before @spawn).
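For reference, here is a minimal sketch contrasting the two spawning strategies (this is not your actual MWE; the function names, the base-case cutoff, and the chunk count are made up for illustration):

```julia
using Base.Threads: @spawn

# Recursive divide-and-conquer reduction: each call spawns a child task,
# so many intermediate tasks (and their partial results) can be alive at
# once before any of them finishes, which inflates peak memory.
function reduce_recursive(f, xs, lo=1, hi=length(xs))
    hi - lo < 1024 && return mapreduce(f, +, view(xs, lo:hi))
    mid = (lo + hi) ÷ 2
    left = @spawn reduce_recursive(f, xs, lo, mid)
    right = reduce_recursive(f, xs, mid + 1, hi)
    return fetch(left) + right
end

# Chunked "map" style: split the work up front, spawn all tasks at once,
# then sum the fetched partial results. The number of live tasks (and
# partial results) is bounded by nchunks.
function reduce_chunked(f, xs; nchunks=Threads.nthreads())
    chunks = Iterators.partition(xs, cld(length(xs), nchunks))
    tasks = [@spawn mapreduce(f, +, c) for c in chunks]
    return sum(fetch, tasks)
end
```

With the chunked version, every task can start its work immediately after being spawned, and peak memory scales with the chunk count rather than the recursion depth.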

None of this answers what causes your OOM crashes and the excessive memory use I see when running the recursive version, whether it is a GC bug or a problem with @spawn, or whether there is some shared state/data race that we are all not seeing. (Regarding the latter, there can be weird bugs with shared state due to Julia’s scoping rules, cf. Inconsistent results when using Threads.@threads in a loop - #2 by fatteneder.)
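As a generic illustration of the kind of shared-state bug meant here (not something I have found in your code), an unsynchronized accumulator mutated from a threaded loop is a data race; one safe alternative is an atomic accumulator:

```julia
using Base.Threads: @threads, Atomic, atomic_add!

# BUGGY pattern: `acc` is shared mutable state, and `+=` is a
# non-atomic read-modify-write, so concurrent updates can be lost.
function racy_sum(xs)
    acc = 0
    @threads for x in xs
        acc += x   # data race
    end
    return acc
end

# Safe variant: atomic_add! makes each update atomic.
function safe_sum(xs)
    acc = Atomic{Int}(0)
    @threads for x in xs
        atomic_add!(acc, x)
    end
    return acc[]
end
```

The racy version may happen to give the right answer on small inputs or few threads, which is exactly what makes such bugs hard to spot.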

EDIT: The previous answer was a reply to a wrong post and now I have to insert some more text to make the ‘similar reply’ check of discourse go away …
