Hi, I am running a Monte Carlo type method and I would like to use multi-threading.
Unfortunately, I am a beginner in multi-threading for operations of the form n_{i}/n_{i-1}.
Usually, I use multiple cores to run different iterations of the same code and I do not
care about the order of the output. But now I need to put the ordered outputs in an array to compute Prod(n_{i}/n_{i-1}). Here is a simplified version of my code:
using Random
using LinearAlgebra
using Distributions
rad = 1.0
rmax = 4
ra = range(0, rad, length=rmax)
sets_prod=Any[]
for s=1:2
    suc_iter, pos_iter, neg_iter, rat_set = Any[], Any[], Any[], Any[]
    ratious = Any[]
    @sync begin
        for ri=1:rmax-1
            Threads.@spawn begin
                suc, pos, neg = 0, 0, 0
                for i=1:10^(5)
                    d = 0
                    d=rand(Uniform(-ra[ri],ra[ri]), 3)
                    println("ri, i = $ri, $i on thread $(Threads.threadid())")
                    if sum(d) >= 0
                        suc += 1
                        d[2]=-d[2]
                        if sum(d) >= 0
                            pos += 1
                        else
                            neg +=1
                        end
                    end
                end # end iterations
                push!(suc_iter,suc)
                push!(pos_iter,pos)
                push!(neg_iter,neg)
                if (pos/suc) > 0
                    push!(rat_set,(pos/suc))
                else
                    push!(rat_set,(0.0))
                end
            end # end spawn
        end # end radii
    end # end @sync
    println("rat_set: ", rat_set)
    for k=2:rmax-1
        push!(ratious, rat_set[k]/rat_set[k-1])
    end
    push!(sets_prod,prod(ratious))
end # end sets
println("sets", sets_prod)
The problem is that I want to parallelize the loop `for ri=1:rmax-1` and afterwards obtain
the ratios ordered by ri=1,2,3,..., because I need their product at the end. I tried
`Threads.@spawn` with `@sync`, and I am able to send every part of the loop to different
threads, but I am not able to collect the results from
    push!(suc_iter,suc)
    push!(pos_iter,pos)
    push!(neg_iter,neg)
in the correct order. I tried atomic operations but could not make them work.
Does anyone have a suggestion? Thanks in advance.
If you do a known number of iterations and collect the same number of elements into each array, you may allocate the storage in advance and set suc_iter[ri] = suc etc. instead of push!, I guess.
Thank you for your answer. What I understood is that it is better to create the array at the beginning,
like this:
    suc_iter = fill(0.0, iters, iters)
and then fill it with the suc value from every iteration for ri. The problem I have with that method is that my arrays are usually built from 10^8 iterations and then 120 realizations of the code, and my computer rapidly runs out of memory. I hope I have understood it correctly.
Please correct me if I did not.
Thanks for the help.
But you have the same problem if you start with an empty vector and push! to it, as long as the vectors end up the same size. I don’t understand why you don’t just pre-allocate. In fact, I would expect pre-allocating to cause less memory use.
But if you are going to create several length 10^8 vectors (of eltype Any) times 120, and keep it all in memory at the same time, that will just not work. It has nothing to do with multithreading, you are just running out of memory.
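To put rough numbers on this, here is a back-of-the-envelope estimate. It assumes 8 bytes per stored element (a `Float64` or `Int64`); a vector of eltype `Any` would need at least that much. The sizes 10^8 and 120 are the ones quoted above, and `n_radii = 3` matches `rmax-1` in the posted code.

```julia
# Rough memory estimate, assuming 8 bytes per element.
n_iter = 10^8          # iterations per realization (from the thread)
n_real = 120           # realizations (from the thread)

bytes_per_vector = n_iter * sizeof(Float64)   # one stored value per iteration
gib(x) = x / 2^30                             # bytes -> GiB

println("one vector:  ", round(gib(bytes_per_vector), digits=2), " GiB")
println("120 vectors: ", round(gib(bytes_per_vector * n_real), digits=2), " GiB")

# Storing only the final counter for each radius needs one slot per radius,
# independent of the number of iterations.
n_radii = 3
bytes_counters = n_radii * sizeof(Int64)
println("counters:    ", bytes_counters, " bytes")
```

So if only the final `suc`, `pos` and `neg` per radius are needed, the pre-allocated arrays have length `rmax-1`, not 10^8.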
Also, you absolutely should put your code inside functions. Working in global scope like this is terrible for performance and memory use.
Before you start using advanced features like multithreading, you should read the Performance Tips section of the Julia manual to get rid of basic performance mistakes.
I suggest you do the following: Read the performance tips. Then create a minimal example (MWE), which is a function that returns the quantity or quantities you need to get.
Right now, your example isn’t minimal. It contains a lot of stuff that does nothing. For example, suc_iter, pos_iter, neg_iter are not used for anything. Can you just delete them from your MWE? If you write a function with an explicit return statement, we will know which operations and variables can be deleted or optimized. Right now we can’t know which parts of the code are important and which are not.
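As an illustration of what such an MWE could look like, here is a serial sketch. The function names `flip_ratio` and `ratio_product` and the example radii are hypothetical, and the samples are drawn with Base's `rand()` instead of `Distributions.Uniform` to keep the example dependency-free; it is a guess at the intended computation, not the original poster's actual code.

```julia
# Hypothetical MWE: all work happens inside functions with explicit returns.

# Fraction of accepted samples (sum >= 0) that are still accepted
# after the sign of the second coordinate is flipped.
function flip_ratio(r::Float64, n::Int)
    suc = 0
    pos = 0
    for _ in 1:n
        # three coordinates drawn uniformly from [-r, r]
        d1 = r * (2 * rand() - 1)
        d2 = r * (2 * rand() - 1)
        d3 = r * (2 * rand() - 1)
        if d1 + d2 + d3 >= 0
            suc += 1
            if d1 - d2 + d3 >= 0
                pos += 1
            end
        end
    end
    # guard against division by zero when nothing was accepted
    return suc == 0 ? 0.0 : pos / suc
end

# Product of consecutive ratios rat[k] / rat[k-1] over the given radii (serial).
function ratio_product(radii, n::Int)
    rats = [flip_ratio(Float64(r), n) for r in radii]
    return prod(rats[2:end] ./ rats[1:end-1])
end

println(ratio_product(range(0.5, 1.0, length=3), 10^5))
```

With the return value made explicit, it is clear that only one ratio per radius has to be kept, so nothing of length 10^8 is ever stored.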
@sync for ri in 1:rmax-1
    Threads.@spawn begin
        ...
        for i in 1:10^5
            ...
        end
        suc_iter[ri] = suc
        ...
    end
end
Or something like that.
And in any case, I recommend following @DNF’s advice on decomposing your logic into functions with clear inputs instead of writing the results into global variables. That will also make it easier for the community to help you further.
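Putting the two suggestions together (pre-allocated slots indexed by `ri`, and logic inside functions), a complete version might look roughly like the sketch below. The names `count_hits` and `threaded_ratios` and the example radii are hypothetical, and Base's `rand()` stands in for `Distributions.Uniform`. Each task writes only to its own index, so the results come out ordered by `ri` no matter which task finishes first, and no two tasks touch the same element (unlike concurrent `push!` on a shared vector, which is not thread-safe).

```julia
# Hypothetical kernel: counts for one radius, returned instead of pushed to globals.
function count_hits(r::Float64, n::Int)
    suc, pos, neg = 0, 0, 0
    for _ in 1:n
        # three coordinates drawn uniformly from [-r, r]
        d1 = r * (2 * rand() - 1)
        d2 = r * (2 * rand() - 1)
        d3 = r * (2 * rand() - 1)
        if d1 + d2 + d3 >= 0
            suc += 1
            if d1 - d2 + d3 >= 0   # same test after flipping the second coordinate
                pos += 1
            else
                neg += 1
            end
        end
    end
    return suc, pos, neg
end

function threaded_ratios(radii::AbstractVector{<:Real}, n::Int)
    m = length(radii)
    # one pre-allocated slot per radius
    suc_iter = zeros(Int, m)
    pos_iter = zeros(Int, m)
    neg_iter = zeros(Int, m)
    @sync for ri in 1:m
        Threads.@spawn begin
            suc, pos, neg = count_hits(Float64(radii[ri]), n)
            # each task writes only to index ri, so the order is fixed
            suc_iter[ri] = suc
            pos_iter[ri] = pos
            neg_iter[ri] = neg
        end
    end
    rat_set = [s == 0 ? 0.0 : p / s for (p, s) in zip(pos_iter, suc_iter)]
    ratio_prod = prod(rat_set[2:end] ./ rat_set[1:end-1])
    return ratio_prod, suc_iter, pos_iter, neg_iter
end

ratio_prod, suc_iter, pos_iter, neg_iter = threaded_ratios([0.5, 0.75, 1.0], 10^5)
println("product of ratios: ", ratio_prod)
```

Julia has to be started with more than one thread (for example `julia -t 4`) for the tasks to actually run in parallel; with a single thread the code still gives correctly ordered results.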
Thank you all for the answers, they are really helpful. I need to learn more about this language, and now I know why my program consumes so much memory. I will follow the advice of @DNF
and use the code from @Vasily_Pisarev. I will post the new function later. Thanks for the help.