I would like to write a thread-parallel sample of a pattern that is common in time-dependent evolution problems (see `main_serial` in the code below). An array `arr_task` is generated over time: it is updated in each loop iteration, and the `arr_task` produced by the current iteration depends on the `arr_task` from the previous iteration, so it must be generated serially. `arr_task` is then passed into `process!()`, and the result of `process!()` is stored in the array `arr_stored`. These two steps give the result of processing `arr_task` at each time step.

`main_serial` in the code is the serial program. To parallelize it, I defined a function `main_parallel_copy`, where I tried to use `copy` to pass `arr_task` into `process!()` on multiple threads. However, the result differs from the serial program; it seems there is a data race (presumed from the output after running it). With the help of the Julia Chinese group, I first assigned `arr_task` to a temporary variable, `arr_temp = copy(arr_task)`, and then passed `arr_temp` into `process!()`, as shown in the function `main_parallel_assign`.
Running environment:

- Windows 10: Microsoft Windows [Version 10.0.17763.4252]
- Julia version 1.8.5

How to run: `julia -t4 test.jl`

This is the code from the `test.jl` script:
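In case it matters, a sanity check I can add at the top of the script (my own addition, not part of `test.jl` above) to confirm that Julia actually started with the threads requested by `-t4`:

```julia
# Print the number of threads this Julia session was started with.
# With `julia -t4 test.jl` this should report 4.
println("nthreads = ", Threads.nthreads())
@assert Threads.nthreads() >= 1
```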
```julia
function main_serial()
    arr_stored = zeros(Int, 4)
    arr_task = zeros(Int, 1)
    for time in 1:4
        # get a task, should be serial
        arr_task[1] = time + arr_task[1]
        sleep(1)
        # handle the task
        process!(arr_stored, arr_task, time)
    end
    println("serial : $arr_stored")
end

function main_parallel_copy()
    arr_stored = zeros(Int, 4)
    arr_task = zeros(Int, 1)
    @sync for time in 1:4
        # get a task, should be serial
        # println("copy!")
        arr_task[1] = time + arr_task[1]
        sleep(1)
        # handle the task, but parallel on threads
        Threads.@spawn process!(arr_stored, copy(arr_task), time)
    end
    println("parallel copy : $arr_stored")
end

function main_parallel_assign()
    arr_stored = zeros(Int, 4)
    arr_task = zeros(Int, 1)
    @sync for time in 1:4
        # get a task, should be serial
        arr_task[1] = time + arr_task[1]
        sleep(1)
        arr_temp = copy(arr_task)
        # handle the task, but parallel on threads
        Threads.@spawn process!(arr_stored, arr_temp, time)
    end
    println("parallel assign : $arr_stored")
end

function process!(stored, task, t)
    # time of processing
    @time begin
        a = rand(100, 100)
        [exp(a) for i in 1:100]
    end
    stored[t] = task[1]
end

@time main_serial()
println()
@time main_parallel_copy()
println()
@time main_parallel_assign()
```
This is the output:

```
 2.040268 seconds (1.50 k allocations: 46.069 MiB, 2.20% gc time)
 2.136096 seconds (1.50 k allocations: 46.069 MiB, 1.82% gc time)
 2.072491 seconds (1.50 k allocations: 46.069 MiB, 1.06% gc time)
 2.073747 seconds (1.50 k allocations: 46.069 MiB, 0.64% gc time)
serial : [1, 3, 6, 10]
 12.372814 seconds (6.39 k allocations: 184.297 MiB, 0.96% gc time, 0.32% compilation time)

 2.692697 seconds (2.70 k allocations: 82.098 MiB, 1.35% gc time)
 2.884239 seconds (3.36 k allocations: 101.799 MiB, 1.26% gc time)
 2.676654 seconds (2.80 k allocations: 84.476 MiB, 1.36% gc time)
 2.124010 seconds (1.52 k allocations: 46.069 MiB, 1.87% gc time)
parallel copy : [3, 6, 10, 10]
 7.806365 seconds (7.28 k allocations: 184.345 MiB, 0.98% gc time, 0.26% compilation time)

 2.763133 seconds (2.64 k allocations: 80.335 MiB, 2.10% gc time)
 3.105436 seconds (3.35 k allocations: 101.645 MiB, 1.87% gc time)
 2.889880 seconds (2.78 k allocations: 83.787 MiB, 3.14% gc time)
 2.118461 seconds (1.52 k allocations: 46.069 MiB, 0.87% gc time)
parallel assign : [1, 3, 6, 10]
 8.015358 seconds (6.95 k allocations: 184.329 MiB, 0.12% compilation time)
```
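To make the timing issue I suspect concrete, here is a minimal single-threaded sketch (my own illustration, not taken from the script above). My guess is that the body of a spawned task, including the `copy(...)` call, is evaluated only when the task actually runs, not at the point where `Threads.@spawn` appears, so the copy can observe mutations made after the spawn. Using `@task`/`schedule` makes the ordering deterministic:

```julia
arr = [1]
t = @task copy(arr)   # task body (including copy) has NOT run yet
arr[1] = 2            # mutate the array before the task runs
schedule(t)           # now the task runs and copies the mutated array
println(fetch(t))     # prints [2], not [1]
```

If this is what happens in `main_parallel_copy`, it would explain why the copy inside `Threads.@spawn` can see a later value of `arr_task[1]`.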
My questions are:

- Why is it that, after copying an array into a function, the array can still be updated externally? See `main_parallel_copy`.
- How else can I parallelize this type of loop with threads?
- In Julia thread parallelism, is there an option to set a variable attribute like `firstprivate` in OpenMP?