Multi-core parallel support for JuMP supported solvers?

Okay, I’m reporting back here: we got about a 1.4x speedup on the HPC cluster using this multi-threaded SCS. I appreciate the help!

Looking forward to hearing about how to use the new distributed capabilities when you get the chance!

@charshaw SCS is not a distributed solver. As indicated above, there are only two places where it can benefit from multithreading (OpenMP), and the speedup is very much problem-dependent.

You’re probably thinking of an MPI-enabled solver, but SCS does not support MPI. Maybe @odow knows if any solver wrapped in MOI can use MPI?

> Maybe @odow knows if any solver wrapped in MOI can use MPI?

Nope. Again, trying to solve this on some sort of HPC is the wrong approach. You can scale to a few cores of a laptop or workstation.


I concur :wink: The only benefits of my remote workstation over my laptop are that I don’t hear it :smiley: and that I can start a problem, disconnect or close my laptop, and still get a solution a few hours later.

Tbh, since I got a laptop with a modern Ryzen, the workstation is noticeably slower :wink:

@charshaw I wanted to ask this earlier: are the SDPs really that random, or do they have some internal structure? If they do, you’re always better off exploiting sparsity (chordal decomposition) and/or symmetry (Wedderburn + orbit decomposition) than throwing more computational resources at the problem.


Thanks again for your help, @odow and @abulak! I really appreciate it. It’s brought the run time down to less than 24 hours, which is very helpful for our cluster scheduler.

I’ve accepted @abulak’s answer on using multi-threading, because I think it addresses what I was getting at!

You’re exactly right: the SDPs are not this random. The constraints (i.e. the elements of \Omega in my earlier post above) usually correspond to edges of a low-degree graph.

I agree that more sophisticated decompositions (e.g. chordal decompositions) would be very useful to try here, but I don’t understand them well enough to implement them at the moment. We’ve also found that the solutions are always low-rank (much lower rank than (# of constraints)^2), so that’s possibly worth exploiting too.

Nevertheless, I think we’re going to stop trying to optimize the runtime for a while, because we have it down low enough to finish our simulations. If the statistical method we are proposing takes off in the future, it’d be worth re-visiting this and writing a specific solver for the underlying SDPs in our case.

Thanks again for being so generous with your time :slight_smile:


Try COSMO.jl: it implements chordal decomposition and applies it automatically.

For low-rank (i.e. really low-rank) solutions, you might have some luck with SDPLR.jl or ProxSDP.jl.
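In case it helps with the COSMO.jl suggestion, here is a minimal sketch of what a sparse SDP could look like in JuMP. The data is a made-up toy (a path graph, so every node has degree at most 2), not the problem from this thread, and it assumes the problem can be written so that the matrix inside the PSD constraint is itself sparse, since that pattern is what a chordal decomposition can exploit. `"decompose"` is the COSMO setting for the decomposition; as far as I know it is on by default, so setting it here is only for visibility.

```julia
using JuMP, COSMO, LinearAlgebra

# Toy data (hypothetical): path graph on n nodes, so every node has degree <= 2.
n = 8
C = Matrix(SymTridiagonal(fill(2.0, n), fill(-1.0, n - 1)))

# "decompose" toggles COSMO's chordal decomposition of PSD constraints.
model = Model(optimizer_with_attributes(COSMO.Optimizer, "decompose" => true))
set_silent(model)

@variable(model, y[1:n])

# S = C - Diag(y). Entries are nonzero only on the diagonal and on graph edges;
# every other entry is an affine expression with no terms and constant zero.
S = Matrix{AffExpr}(undef, n, n)
for i in 1:n, j in 1:n
    S[i, j] = i == j ? C[i, i] - y[i] : AffExpr(C[i, j])
end

@constraint(model, Symmetric(S) in PSDCone())
@objective(model, Max, sum(y))

optimize!(model)
println(termination_status(model))
println(objective_value(model))
```

A dense `@variable(model, X[1:n, 1:n], PSD)` has no sparsity pattern to decompose, which is why the sketch puts the PSD constraint on a sparse affine matrix instead. Whether your own formulation can be rearranged this way is something you would have to check.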
