@charshaw I modified the script a bit to run the MWE twice; here are my results of `@time optimize!` within

```bash
for n in {1..8}; do
    OMP_NUM_THREADS=$n julia using_JuMP_multithreading.jl
done
```

I fixed `n = 450`;
passing

```julia
scs_opt = optimizer_with_attributes(
    SCS.Optimizer,
    "linear_solver" => SCS.MKLDirectSolver,
    "eps_abs" => 1e-8, # to increase the iteration count
    "verbose" => false,
)
```

to `mwe_program` (a rough sketch of such a script is below).
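For context, here is a minimal sketch of what such a script could look like. The actual `using_JuMP_multithreading.jl` and `mwe_program` from this thread may differ; the model below (a single n-by-n PSD variable with a trace constraint and a random linear objective) is only my assumption.

```julia
# Hypothetical sketch of the benchmark script, not the thread's actual MWE:
# one n-by-n PSD variable, a trace constraint, a random linear objective.
using JuMP, SCS, LinearAlgebra

scs_opt = optimizer_with_attributes(
    SCS.Optimizer,
    "linear_solver" => SCS.MKLDirectSolver,
    "eps_abs" => 1e-8,
    "verbose" => false,
)

function mwe_program(optimizer, n)
    model = Model(optimizer)
    C = Symmetric(randn(n, n))                             # random objective data
    @variable(model, X[1:n, 1:n], PSD)                     # the single PSD cone
    @constraint(model, sum(X[i, i] for i in 1:n) == 1)     # trace normalization
    @objective(model, Min, sum(C[i, j] * X[i, j] for i in 1:n, j in 1:n))
    @time optimize!(model)                                 # only the solve is timed
    return model
end

n = 450
println("Creating the problem instance... n = $n, OMP_NUM_THREADS = ",
        get(ENV, "OMP_NUM_THREADS", "unset"))
mwe_program(scs_opt, n)   # first run includes compilation
mwe_program(scs_opt, n)   # second run gives the timings reported below
```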
With this I get

```
Creating the problem instance... n = 450, OMP_NUM_THREADS = 1
7.061236 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 2
5.475805 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 3
5.021955 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 4
4.565751 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 5
4.389557 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 6
4.413405 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 7
4.360352 seconds (1.04 k allocations: 90.613 MiB)
Creating the problem instance... n = 450, OMP_NUM_THREADS = 8
4.339999 seconds (1.04 k allocations: 90.613 MiB)
```
That is about 1.6× faster, leveling off around 4-5 threads. The speed-up comes from the following two lines: Code search results · GitHub
So either

- you have plenty of PSD constraints and SCS projects onto them in parallel (the MWE has only one), or
- the (sparse CSC) matrix `A` defining the problem is dense enough that parallelizing `A'x` over its columns pays off (see the sketch below).
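To illustrate the second point (my own sketch in Julia, not SCS's actual C code): with CSC storage each entry of `A'x` is a dot product of one stored column of `A` with `x`, so the columns can be processed by threads independently, and the payoff grows with the number of nonzeros per column.

```julia
# Illustrative only: column-parallel A'x for a sparse CSC matrix, mimicking
# the kind of loop SCS parallelizes with OpenMP (this is not SCS code).
using SparseArrays

function threaded_At_mul_x(A::SparseMatrixCSC, x::AbstractVector)
    m, n = size(A)
    @assert length(x) == m
    rv, nz = rowvals(A), nonzeros(A)
    y = zeros(promote_type(eltype(A), eltype(x)), n)
    Threads.@threads for j in 1:n          # each column is independent
        acc = zero(eltype(y))
        for k in nzrange(A, j)             # stored entries of column j
            acc += nz[k] * x[rv[k]]
        end
        y[j] = acc
    end
    return y
end

A = sprand(2_000, 2_000, 0.05)
x = rand(2_000)
@assert threaded_At_mul_x(A, x) ≈ A' * x
```

Whether this helps in the MWE depends on how dense the problem data `A` ends up being after JuMP bridges the model into SCS's form.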
Note: I don’t observe the same scaling with `SCS.DirectSolver`; the solve time stays around 7 s.
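For reference, that comparison run only swaps the `linear_solver` attribute (same sketch-level assumptions about the rest of the script as above):

```julia
# Same attributes as before, but with SCS's plain direct solver; in my runs
# this configuration did not show the same scaling with OMP_NUM_THREADS.
scs_direct = optimizer_with_attributes(
    SCS.Optimizer,
    "linear_solver" => SCS.DirectSolver,
    "eps_abs" => 1e-8,
    "verbose" => false,
)
```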