I am creating a function that will be call n number of times in recursion to sum the elements of an array in parallel using spawn to divide the array into two parts.
then adding the sum of the return values from the first and second call. I am unable to store the values of rt1 and rt2 in a global variable to return the total sum of an array.
The problem is when I return the sum, it only returns the sum of the last two recursions, because I have to initialize the value of the return which gets reset in every run
Here is the naive way of recursively writing the Fibonacci sequence:
julia> function fib(n)
if n > 1
fib(n-1) + fib(n-2)
else
n
end
end
fib (generic function with 1 method)
See how at each recursive step, you pass a the decremented argument back to the function? You can do something very similar with sum, except itâd involve splitting original the array into halves at each step recursively (and ideally, using view so you donât allocate a whole new array each time)
By the way, if youâre looking for a ready-made parallel sum, check out the ThreadsX.jl package:
julia> using ThreadsX
julia> let v = randn(1_000_000)
s1 = @btime ThreadsX.sum($v, basesize=100_000)
s2 = @btime sum($v)
s1 â s2
end
75.279 Îźs (860 allocations: 36.28 KiB)
251.748 Îźs (0 allocations: 0 bytes)
true
For smaller array sizes itâs pretty easy to do a lot better than ThreadsX is doing here unfortunately. Not really sure whatâs wrong with itâs strategy, but a simple recursive @spawn sum can be a fair bit faster for smaller arrays.
Thanks, I am aware of the basic recursive functions, but here when I use Spawn to divide the tasks and run them in parallel, I am unable to get the point of summing the output of spawned tasks.
Thanks, I am aware of the basic recursive functions,
Ah sorry, itâs a little hard to understand what you do and donât know here, as the way you talked about using a global variable was setting off a lot of warning lights for me.
when I use Spawn to divide the tasks and run them in parallel, I am unable to get the point of summing the output of spawned tasks.
Does this help?
function fib(n::Int)
if n < 2
return n
end
t = @spawn fib(n - 2)
return fib(n - 1) + fetch(t)
end
The idea is that you first spawn one task, then do the other computation, and then use fetch to get the result of the first task and add them together.
If you just want to sum one array in parallel, a nice simple way to do it is
function tsum(xs)
if length(xs) <= 1048576
return sum(xs) # base case
else
t = Threads.@spawn tsum(@view xs[begin:begin+(end-begin+1)á2])
return tsum(@view xs[begin+(end-begin+1)á2+1:end]) + fetch(t)
end
end
By the way, unless you wnat to learn how to implement parallel very low-level code, I strongly advice to not use @spawn directly. Itâs especially true for a high-level task like this. FYI, I wrote an overview for the high-level approach here: A quick introduction to data parallelism in Julia