Yeah, when I was playing with depth == X, I found (obvious in hindsight) that I couldn’t get more than N threads to trigger, regardless of which depth I picked. Another possibility might be depth % X == 0, so you get some sub-spawning to bring the thread count up while still staying short of one calculation per thread.
Here is the code I was using to test this, by the way. It prints the recursion and thread structure as an indented tree, one line per call. Maybe you can modify it to more closely resemble what you’ve got.
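N = 3  # branching factor; also used as the depth cutoff below

# ID of the first node at a given depth, numbering nodes breadth-first from 0
rootid(depth) = depth == 0 ? 0 : sum(N^x for x in 0:(depth-1))

# ID of the first child of `job`, where `job` sits at `depth`
newid(job, depth) = rootid(depth+1) + N*(job-rootid(depth))

function recursive(job=0, depth=0)
    # Print job ID, depth, and thread ID, indented by depth
    println("$(' '^depth) $job, $depth, $(Threads.threadid())")
    depth > N && return job + 1  # past the cutoff: stop recursing
    results = zeros(N)
    if depth == 2
        # Only this level hands its children to threads
        Threads.@threads for i in 1:N
            results[i] = recursive(newid(job, depth) + i-1, depth + 1)
        end
    else
        # Every other level recurses serially
        for i in 1:N
            results[i] = recursive(newid(job, depth) + i-1, depth + 1)
        end
    end
    return sum(results)
end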
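
To compare the depth == X and depth % X == 0 conditions without reading through the printed tree, here is a rough sketch that just counts how many nodes would open a threaded loop under each rule. The helper `count_spawn_points` and its keyword names are made up for this illustration, and it assumes the same tree shape as the code above (branching factor 3, recursion stops once depth > 3). It says nothing about how Julia actually schedules nested threaded loops.

```julia
# Hypothetical helper (not part of the code above): count the nodes at which
# a threaded loop would be opened, given a predicate on depth.
# Assumes branching factor n and that nodes with depth > maxdepth do not loop.
function count_spawn_points(spawn_here; n=3, maxdepth=3, depth=0)
    depth > maxdepth && return 0
    here = spawn_here(depth) ? 1 : 0
    # All n children at the next depth have identical subtrees
    return here + n * count_spawn_points(spawn_here; n=n, maxdepth=maxdepth, depth=depth + 1)
end

X = 2
single = count_spawn_points(d -> d == X)        # spawn at one level only
repeated = count_spawn_points(d -> d % X == 0)  # spawn at every X-th level, root included

println("depth == X:     $single spawn points")
println("depth % X == 0: $repeated spawn points")
```

With X = 2 the modulo rule also matches depth 0, so the root opens a threaded loop as well. If you don’t want that, something like `d -> d > 0 && d % X == 0` would skip it.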