# Sum of n array by recursion in Parallel threads using spawn

I am creating a function that will be call n number of times in recursion to sum the elements of an array in parallel using spawn to divide the array into two parts.

For example Recursion(a, lo, hi)

a = (1,5,6,7,8,8,8,8,9,9)

rt1 = recurision(a, lo, mid)
rt2 = recursion(a, mid+1, hi)

then adding the sum of the return values from the first and second call. I am unable to store the values of rt1 and rt2 in a global variable to return the total sum of an array.

what have you tried so far?

2 Likes

The problem is when I return the sum, it only returns the sum of the last two recursions, because I have to initialize the value of the return which gets reset in every run

you’re doing it wrong if your “recursion” modifies/relies on some global variable

3 Likes

Can you please suggest the right way?

Here is the naive way of recursively writing the Fibonacci sequence:

``````julia> function fib(n)
if n > 1
fib(n-1) + fib(n-2)
else
n
end
end
fib (generic function with 1 method)
``````

See how at each recursive step, you pass a the decremented argument back to the function? You can do something very similar with `sum`, except it’d involve splitting original the array into halves at each step recursively (and ideally, using `view` so you don’t allocate a whole new array each time)

``````julia> using ThreadsX

julia> let v = randn(1_000_000)
s2 = @btime sum(\$v)
s1 ≈ s2
end
75.279 μs (860 allocations: 36.28 KiB)
251.748 μs (0 allocations: 0 bytes)
true
``````

For smaller array sizes it’s pretty easy to do a lot better than ThreadsX is doing here unfortunately. Not really sure what’s wrong with it’s strategy, but a simple recursive `@spawn` sum can be a fair bit faster for smaller arrays.

3 Likes

Thanks, I am aware of the basic recursive functions, but here when I use Spawn to divide the tasks and run them in parallel, I am unable to get the point of summing the output of spawned tasks.

Thanks, I am aware of the basic recursive functions,

Ah sorry, it’s a little hard to understand what you do and don’t know here, as the way you talked about using a global variable was setting off a lot of warning lights for me.

when I use Spawn to divide the tasks and run them in parallel, I am unable to get the point of summing the output of spawned tasks.

Does this help?

``````function fib(n::Int)
if n < 2
return n
end
t = @spawn fib(n - 2)
return fib(n - 1) + fetch(t)
end
``````

The idea is that you first spawn one task, then do the other computation, and then use `fetch` to get the result of the first task and add them together.

2 Likes

If you post the code that you already have, it will be easier to help.

1 Like

If you just want to sum one array in parallel, a nice simple way to do it is

``````function tsum(xs)
if length(xs) <= 1048576
return sum(xs)  # base case
else
By the way, unless you wnat to learn how to implement parallel very low-level code, I strongly advice to not use `@spawn` directly. It’s especially true for a high-level task like this. FYI, I wrote an overview for the high-level approach here: A quick introduction to data parallelism in Julia