I regularly run SDE simulations. To get a better idea of a system's behavior, I run several identical simulations and plot the results. I figured this would be a good task for the Parallel Ensemble Simulations part of DifferentialEquations. However, I cannot get any speedup from the parallelism, and some weird things happen.
Here is a simple example that reproduces the problems. As seen towards the end, none of these approaches gives a speedup, and some weird errors occur. See the comments in the code.
#We need Distributed to use several processors (?).
using Distributed

#Checks number of threads.
Threads.nthreads()
#(It is equal to 4)
#While following these processes in the terminal using "top", there is only ever one julia process, with maximum 100% CPU.

#Can we use more cores? "cat /proc/cpuinfo | grep processor | wc -l" in the terminal gives us "8". Let's try adding another 6 to make it 7.
#(Not sure whether it matters or not, but it felt good to save 1 for other computer stuff)
#Or so was my initial thought, but then someone said that since I already have the number of threads in Juno equal to 4, I should not add more than another 4 workers.
#So I settled for 3 for starters.
addprocs(3)

#Fetches DiffEq (all workers need to see it).
@everywhere using DifferentialEquations

#Prepares the test system (Lotka-Volterra). Also ensures that all workers see it.
@everywhere begin
    function f(du,u,p,t)
        du[1] = p[1] * u[1] - p[2] * u[1]*u[2]
        du[2] = -3 * u[2] + u[1]*u[2]
    end
    function g(du,u,p,t)
        du[1] = p[3]*u[1]
        du[2] = p[4]*u[2]
    end
    p = [1.5,1.0,0.1,0.1]
    prob = SDEProblem(f,g,[1.0,1.0],(0.0,5000.0),p)
end

#Prepares parallel ensemble simulations.
@everywhere begin
    prob_func(prob,i,repeat) = prob
    ensemble_prob = EnsembleProblem(prob,prob_func=prob_func)
end

#Measures without using parallel ensemble simulations from DiffEq.
function solveN(n)
    for i = 1:n
        solve(prob,SRIW1())
    end
end
@time solveN(40)
#This takes about 10 seconds. While watching in top, there is only one active julia process.

#Checks the speed of various solves using Ensemble:
@time solve(ensemble_prob,SRIW1(),trajectories=40)
#This takes about 12 seconds. While watching in top, there is only one active julia process.
@time solve(ensemble_prob,SRIW1(),EnsembleSerial(),trajectories=40)
#This takes about 12 seconds. While watching in top, there is only one active julia process.
@time solve(ensemble_prob,SRIW1(),EnsembleThreads(),trajectories=40)
#This takes about 13 seconds. While watching in top, there is only one active julia process.
@time solve(ensemble_prob,SRIW1(),EnsembleDistributed(),trajectories=40)
#This takes about 27 seconds. Top gives 3-4 active processes. When there are 3, they typically all have about the same %CPU (towards 100). When there are 4, three of them typically have very little (maybe 2 or 3%).
@time solve(ensemble_prob,SRIW1(),EnsembleSplitThreads(),trajectories=40)
#Running this yields the messages "Worker 3 terminated." and "Worker 2 terminated." in the console. It ends with a ProcessExitedException().
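The thread and worker counts mentioned in the comments above can be confirmed from within the session. This is a minimal sketch that assumes only Base and the Distributed standard library; the helper name `parallel_resources` is hypothetical and not part of DifferentialEquations.

```julia
using Distributed

# Hypothetical helper: collects what this session can actually use for parallel work.
function parallel_resources()
    return (
        threads = Threads.nthreads(),   # threads in this process (fixed at startup)
        processes = nprocs(),           # master process plus workers
        workers = nworkers(),           # workers added with addprocs (1 if none were added)
        cpu_threads = Sys.CPU_THREADS,  # logical cores reported by the OS
    )
end

res = parallel_resources()
println(res)
```

As far as I understand, EnsembleThreads() is limited by the `threads` value and EnsembleDistributed() by the `workers` value, so these numbers are worth checking before timing anything.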
Finally, here is the full output of the ProcessExitedException() generated at the end.
ProcessExitedException()
in top-level scope at base/util.jl:156
in at base/none
in #solve#365 at DiffEqBase/AfQA1/src/solve.jl:46
in at base/none
in #__solve#305 at DiffEqBase/AfQA1/src/ensemble/basic_ensemble_solve.jl:64
in macro expansion at base/util.jl:213
in macro expansion at DiffEqBase/AfQA1/src/ensemble/basic_ensemble_solve.jl:70
in solve_batch at DiffEqBase/AfQA1/src/ensemble/basic_ensemble_solve.jl:165
in #pmap at base/none
in #pmap#213 at stdlib/v1.1/Distributed/src/pmap.jl:126
in #asyncmap at base/none
in #asyncmap#680 at base/asyncmap.jl:81
in #async_usemap at base/none
in #async_usemap#681 at base/asyncmap.jl:154
in maptwice at base/asyncmap.jl:178
in foreach at base/abstractarray.jl:1866
in at base/asyncmap.jl:178