I’m hijacking this thread since we have a very similar problem and I don’t want to split the discussion up too much.
We run Sobol analysis on a largish ODE model in parallel on a 24-core machine. The batch processing of the individual ODE solutions works great and scales wonderfully, but we found two issues/bottlenecks:
- Memory usage goes up massively. I assume that’s because all the results are kept in memory so the Sobol index calculation can work on them most efficiently. For the moment I manage to squeeze the case into 128G or 256G of RAM with ZRAM compression (which, interestingly, doesn’t seem to cost much overhead).
- Only the ODE solves run in parallel batch mode. On our models that’s about 80–90% of the workload, with the remaining 10–20% being the Sobol index calculation. In practice that means roughly a 1:1 ratio between parallel and serial wall-clock time on a typical small problem on my laptop: a toy test case just clocked 13.6 CPU-minutes in 7 wall-clock minutes (50 seconds of wall-clock time parallel on 8 cores, then another 7 minutes on a single thread). On the big machine, on a bigger problem, I had 1h of wall-clock time on 24 cores and expect another 24h on a single core to reach the estimated 48h of CPU time (currently at 28h CPU time — so we’ll see tomorrow how good my estimates are and whether it all scales as I naively expect).
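The arithmetic behind those expectations is just Amdahl’s law — a quick sanity check with my own numbers (nothing library-specific here):

```python
# Amdahl's law: with serial fraction s, the speedup on p cores is
# 1 / (s + (1 - s) / p).
def speedup(s, p):
    return 1.0 / (s + (1.0 - s) / p)

# With 10-20% of the work stuck in the serial Sobol step, 24 cores
# cap out well below 24x:
for s in (0.10, 0.20):
    print(f"serial fraction {s:.0%}: {speedup(s, 24):.1f}x on 24 cores")
# -> serial fraction 10%: 7.3x on 24 cores
# -> serial fraction 20%: 4.3x on 24 cores
```

So the serial index calculation quickly dominates the wall-clock time, which is exactly what I’m seeing.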
Ad 2.: is there a way to run the Sobol analysis in parallel as well?
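For context, the kind of computation I mean: as far as I understand, the Saltelli-type estimators are evaluated independently for each model output, so in principle the index calculation could be split over output columns. A hand-rolled sketch of what I imagine (my own code with made-up names like `sobol_chunk`, not the library’s internals):

```python
# Hand-rolled Saltelli (2010) estimators; all names here are hypothetical.
# The point: each model output column is independent, so the index
# calculation can be chunked over columns and mapped across processes.
import numpy as np
from multiprocessing import Pool

def sobol_chunk(args):
    """First-order and total indices for one chunk of output columns.

    y_a, y_b : (N, k) model outputs for sample matrices A and B
    y_ab     : (d, N, k) outputs for the d "radial" matrices AB_i
    Returns (S1, ST), each of shape (d, k).
    """
    y_a, y_b, y_ab = args
    var = np.var(np.concatenate([y_a, y_b]), axis=0)             # (k,)
    s1 = np.mean(y_b[None] * (y_ab - y_a[None]), axis=1) / var   # (d, k)
    st = 0.5 * np.mean((y_a[None] - y_ab) ** 2, axis=1) / var    # (d, k)
    return s1, st

def sobol_parallel(y_a, y_b, y_ab, n_procs=4, n_chunks=16):
    """Split the k output columns into chunks and process them in parallel."""
    cols = np.array_split(np.arange(y_a.shape[1]), n_chunks)
    jobs = [(y_a[:, c], y_b[:, c], y_ab[:, :, c]) for c in cols if c.size]
    with Pool(n_procs) as pool:
        parts = pool.map(sobol_chunk, jobs)
    s1 = np.concatenate([p[0] for p in parts], axis=1)
    st = np.concatenate([p[1] for p in parts], axis=1)
    return s1, st
```

If the library’s estimator has this per-column structure, something like this should parallelise cleanly — but I don’t know its internals, hence the question.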
Ad 1.: is there a way to write the results object to disk and then have the Sobol analysis read it back in while it calculates the indices? I don’t know the algorithm, so I don’t know whether it can be sequentialised enough for that to work. Or do I need to ask for a 1TB RAM machine?
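In case it helps, this is the sort of out-of-core scheme I was imagining (a numpy-flavoured sketch with toy sizes; all names and the file layout are hypothetical): stream the batch results into a disk-backed array, then let the index pass read back one block of output columns at a time.

```python
# Out-of-core sketch: stream batch results into a disk-backed .npy file,
# then compute indices one block of output columns at a time, so peak RAM
# is O(N * block) instead of O(N * k). Sizes and names are hypothetical.
import os
import tempfile
import numpy as np

N, d, k = 1000, 3, 50   # samples, parameters, model outputs (toy sizes)
path = os.path.join(tempfile.mkdtemp(), "sobol_results.npy")

# Phase 1: each ODE batch writes its rows straight to disk.
out = np.lib.format.open_memmap(path, mode="w+", dtype=np.float64,
                                shape=(d + 2, N, k))
for row in range(d + 2):                 # A, B, and the d AB_i designs
    out[row] = np.random.rand(N, k)      # stand-in for the ODE solutions
out.flush()
del out

# Phase 2: the Sobol pass reads back one column block at a time.
res = np.load(path, mmap_mode="r")
block = 10
s1_parts = []
for c0 in range(0, k, block):
    y = np.asarray(res[:, :, c0:c0 + block])   # only this block in RAM
    y_a, y_b, y_ab = y[0], y[1], y[2:]
    var = np.var(np.concatenate([y_a, y_b]), axis=0)
    s1_parts.append(np.mean(y_b[None] * (y_ab - y_a[None]), axis=1) / var)
s1 = np.concatenate(s1_parts, axis=1)          # (d, k) first-order indices
```

Whether the actual implementation can be restructured like this depends on whether its estimators only ever need a column slice of the results at a time — which is really my question.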