Okay, I’m reporting back here: we got about 1.4x speed up on the HPC using this multi-threaded SCS. I appreciate the help!

Looking forward to hearing about how to use the new distributed capabilities when you get the chance!

@charshaw SCS is not a **distributed** solver. As indicated above, there are only two places where it can benefit from **multithreading** (OpenMP), and the speedup will be very much problem-dependent.
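One way to see why a speedup like the 1.4x reported above is plausible: if only part of the solve runs in threaded kernels, Amdahl's law caps the gain no matter how many cores are available. The sketch below is plain arithmetic, not a measurement of SCS; the 30% parallel fraction and the thread counts are hypothetical values chosen for illustration.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Best-case speedup when only `parallel_fraction` of the runtime scales with threads."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


if __name__ == "__main__":
    # Hypothetical: 30% of the solve time is spent in multithreaded kernels.
    for threads in (2, 4, 16, 64):
        print(threads, round(amdahl_speedup(0.30, threads), 2))
```

With a 30% parallel fraction the speedup can never exceed 1/0.7 ≈ 1.43, which is why adding HPC cores beyond a handful changes very little.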

You’re probably thinking of an MPI-enabled solver, but this is not what SCS supports. Maybe @odow knows if any solver wrapped in MOI can use MPI?

> Maybe @odow knows if any solver wrapped in MOI can use MPI?

Nope. Again, trying to solve this on some sort of HPC is the wrong approach. You can scale to a few cores of a laptop or workstation.

I concur: the only benefits of my remote workstation over my laptop are that I don’t hear it and that I can start a problem, disconnect or close my laptop, and still get a solution a few hours later.

Tbh, since I got a laptop with a modern Ryzen, the workstation is noticeably slower.

@charshaw I wanted to ask this earlier: are the SDPs really that random, or do they have some internal structure? If they do have structure, you’re always better off exploiting sparsity (chordal decomposition) and/or symmetry (Wedderburn + orbit decomposition) than throwing more computational resources at the problem.

Thanks again for your help, @odow and @abulak! I really appreciate it. It’s brought the run time down to less than 24 hours, which is very helpful for our cluster scheduler.

I’ve accepted @abulak’s answer for using multi-threading, because I think it addresses what I was getting at!

You’re exactly right: the SDPs are not that random. The constraints (i.e. the elements of \Omega in my earlier post above) *usually* correspond to edges of a graph with low degree.

I agree that more sophisticated decompositions (e.g. chordal decompositions) would be very useful to try here, but I don’t understand them well enough to implement them at the moment. We’ve also found that the solutions are always low-rank (much lower rank than (# of constraints)^2), so that’s possibly worth exploiting too.

Nevertheless, I think we’re going to stop trying to optimize the runtime for a while, because we have it down low enough to finish our simulations. If the statistical method we are proposing takes off in the future, it’d be worth re-visiting this and writing a specific solver for the underlying SDPs in our case.

Thanks again for being so generous with your time.

Try COSMO.jl: it implements chordal decomposition and applies it on its own.
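For intuition about what a chordal decomposition buys when the constraints follow the edges of a low-degree graph, here is a minimal sketch in plain Python. It is not COSMO.jl's algorithm or API: it uses a simple greedy minimum-degree elimination to build a chordal extension of the sparsity graph and lists its maximal cliques. The 6-cycle used as input is a hypothetical stand-in for the graph behind \Omega.

```python
from itertools import combinations


def chordal_cliques(n, edges):
    """Maximal cliques of a chordal extension built by greedy min-degree elimination."""
    adj = {v: set() for v in range(n)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    remaining = set(range(n))
    cliques = []
    while remaining:
        # Eliminate the vertex with the fewest not-yet-eliminated neighbours.
        v = min(remaining, key=lambda u: (len(adj[u] & remaining), u))
        nbrs = adj[v] & remaining
        # Fill-in: connect the neighbours pairwise so the extension stays chordal.
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        cliques.append(frozenset(nbrs | {v}))
        remaining.remove(v)

    # Drop cliques that are strictly contained in another one.
    maximal = [c for c in cliques if not any(c < d for d in cliques)]
    return sorted(sorted(c) for c in maximal)


if __name__ == "__main__":
    n = 6
    cycle = [(i, (i + 1) % n) for i in range(n)]  # hypothetical low-degree graph
    for clique in chordal_cliques(n, cycle):
        print(clique)
```

When the aggregate sparsity pattern is chordal, a single large PSD constraint can be replaced by one smaller PSD constraint per clique plus consistency constraints on the overlaps. For the 6-cycle the largest clique has 3 vertices, so a 6×6 block becomes several 3×3 blocks; the saving grows with problem size as long as the cliques stay small.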

For low-rank (i.e. really low-rank) solutions, you might have some luck with SDPLR.jl or ProxSDP.jl.
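To see why low rank is worth exploiting: low-rank methods such as the one behind SDPLR replace the n×n PSD variable X by a factor V with X = V Vᵀ, where V has only r columns. The sketch below just counts unknowns in each form; the sizes n = 1000 and r = 5 are hypothetical, not taken from the problem in this thread.

```python
def dense_psd_unknowns(n: int) -> int:
    """Free entries of a symmetric n x n matrix (upper triangle including the diagonal)."""
    return n * (n + 1) // 2


def factored_unknowns(n: int, rank: int) -> int:
    """Entries of an n x rank factor V in the parametrisation X = V V^T."""
    return n * rank


if __name__ == "__main__":
    n, rank = 1000, 5  # hypothetical sizes
    print("dense X:", dense_psd_unknowns(n))
    print("factor V:", factored_unknowns(n, rank))
```

The price of the factorisation is that the problem in V is no longer convex, which is why these solvers tend to pay off only when the rank really is small.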