I’m trying to use the Threads module.
I have long-running simulations that I want to run in parallel, saving the data to a single file while the simulations are still running.
Here is what I have done so far:
using Base.Threads: @threads, @spawn, @sync
using DelimitedFiles

a = zeros(10,5)
@sync for j in 1:5
    @spawn for i in 1:10
        a[i,j] = j*i
    end
    sleep(2)
    writedlm("test_mt_save.txt", a[:,1:j])
end
which gives the following output in “test_mt_save.txt”
@spawn creates a task that starts running on an available thread. While that task is running, the main task carries on with the code that follows until it is told to wait, e.g., via @sync. To illustrate:
@sync begin
    @spawn begin
        sleep(2)
        println("Task done!")
    end
    println("Waiting for task to complete...") # Executes while spawned task is `sleep`ing
end
println("Done waiting!") # Executes after spawned task completes
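As a side note, @sync is not the only way to wait. The sketch below assumes you keep a handle to a single task and want its return value; the variable names t and result are illustrative only. Calling fetch blocks until the task finishes and returns its value (wait does the same without returning the value).

```julia
using Base.Threads: @spawn

# Spawn a task and keep a handle to it; the main task continues immediately.
t = @spawn begin
    sleep(0.1)   # stand-in for real work
    21 * 2       # value returned by the task
end

# fetch blocks until the task has finished, then returns its value.
result = fetch(t)
println("Task returned ", result)
```

This is handy when each task produces a value instead of writing into a shared array.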
But then, why does Jeff Bezanson’s example from his Julia Computing seminar on multithreading (minute 15:18) look like this?
function escapetime(z; maxiter = 80)
    c = z
    for n in 1:maxiter
        if abs(z) > 2
            return n-1
        end
        z = z^2 + c
    end
    return maxiter
end

function mandelbrot(; width = 80, height = 20, maxiter = 80)
    out = zeros(Int, height, width)
    real = range(-2.0, 0.5, length = width)
    imag = range(-1.0, 1.0, length = height)
    @sync for x in 1:width
        @spawn for y in 1:height
            z = real[x] + imag[y]*im
            out[y,x] = escapetime(z, maxiter = maxiter)
        end
    end
    return out
end
Here, the @sync macro is clearly on the outer loop.
Thanks again!
@sync for x in 1:width
    @spawn for y in 1:height
        z = real[x] + imag[y]*im
        out[y,x] = escapetime(z, maxiter = maxiter)
    end
    # Code written here will run concurrently with spawned code
end
# Code written here will run after spawned code finishes
return out
Thanks a lot to all of you for taking the time to explain those concepts.
I’ll need a bit more time to grasp all of the details but you put me on the right track!
I think I finally found the solution to my problem.
I used the following code in order to write to file correctly while still being able to use multithreading:
using Base.Threads: @threads, @spawn, @sync
using DelimitedFiles

a = zeros(10,5)
for j in 1:5
    @sync for i in 1:10
        @spawn begin
            a[i,j] = sum(rand(5000,5000)*rand(5000,5000))
        end
    end
    writedlm("test_mt_save.txt", a[:,1:j])
end
This way, a new task is spawned for each value of i. By putting @sync in front of the inner loop, I can be sure that all tasks for column j have finished before the file is written.
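One way to check that claim is a small sketch like the one below. It is not the original simulation: the expensive matrix product is replaced by a short sleep plus i * j, and the file goes to a temporary path from tempname(); both are assumptions made only so the check runs quickly. After each @sync returns, column j should be completely filled, and the file read back should match the columns written so far.

```julia
using Base.Threads: @spawn, @sync
using DelimitedFiles

a = zeros(10, 5)
path = tempname()   # temporary file; a stand-in for "test_mt_save.txt"

for j in 1:5
    @sync for i in 1:10
        @spawn begin
            sleep(0.01)       # stand-in for an expensive simulation step
            a[i, j] = i * j   # each task writes to its own element
        end
    end
    # @sync has returned, so every task for column j is finished here.
    @assert a[:, j] == (1:10) .* j

    writedlm(path, a[:, 1:j])
    written = readdlm(path)
    @assert written == a[:, 1:j]   # file holds columns 1..j, fully filled
end

rm(path)   # clean up the temporary file
```

If the writedlm call were moved inside the @sync block instead, the first assertion would no longer be guaranteed to hold at that point.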
It seems to work.
Please correct me if anything I wrote is wrong.
Thanks again,
Olivier
Edit: I edited the code following @StevenWhitaker's comment that sum(sum(matrix)) == sum(matrix).
By the way, sum(sum(x)) == sum(x) when x is an array of numbers; sum (without the dims keyword argument) sums all the values of the input collection, not just the values along one dimension (like MATLAB does).
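A small illustration of the difference, using a made-up 2×2 matrix x (the variable names are only for this example):

```julia
x = [1 2; 3 4]

total    = sum(x)            # sums every element of the matrix
col_sums = sum(x, dims = 1)  # 1×2 matrix: one sum per column
row_sums = sum(x, dims = 2)  # 2×1 matrix: one sum per row

println(total)
println(col_sums)
println(row_sums)
```

So the MATLAB-style nested call is only needed if you reduce along one dimension first, as in sum(sum(x, dims = 1)).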