@tkf, I cannot answer your question.
I tried using Folds.sum with DistributedEx, as in the official example. That works:
using Folds
using ThreadsX
using BenchmarkTools
@btime sum(1:10000)
@btime Folds.sum(1:10000)
@btime Folds.sum(1:10000, ThreadedEx())
@btime Folds.sum(1:10000, DistributedEx())
It runs, but with no performance gain. Timings are in the same order as the calls above:
0.024 ns (0 allocations: 0 bytes)
6.198 μs (57 allocations: 3.53 KiB)
6.281 μs (58 allocations: 3.55 KiB)
309.233 μs (243 allocations: 13.42 KiB)
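A caveat on my own numbers: a sub-nanosecond result like 0.024 ns usually means the compiler worked out the answer ahead of time, and as far as I know `sum` on a `UnitRange` is computed from a closed-form formula anyway, so there is hardly any work to parallelize. Below is a minimal sketch of what might be a fairer comparison. The per-element function `work` and the input `xs` are made up for illustration; `$xs` interpolates the array so that global-variable access is not part of the measurement.

```julia
using Folds
using BenchmarkTools

# Hypothetical per-element workload, chosen only so that each element costs something.
work(x) = sin(x)^2 + cos(x)^2

# A real array instead of a range, so the sum cannot be reduced to a formula.
xs = rand(10_000)

# Base reduction as the baseline.
@btime sum(work, $xs)

# Same reduction through Folds with the threaded executor.
@btime Folds.sum(work, $xs, ThreadedEx())
```

Whether the threaded version wins will depend on how many threads Julia was started with and on how expensive `work` is per element.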
I don’t actually know how to use Folds.map with DistributedEx for the map example from my previous comment. All I can do is this:
@btime map(((x, y) for x in rand(10000), y in rand(10000))) do (x, y)
    2x + 3y
end
@btime ThreadsX.map(((x, y) for x in rand(10000), y in rand(10000))) do (x, y)
    2x + 3y
end
@btime Folds.map(((x, y) for x in rand(10000), y in rand(10000))) do (x, y)
    2x + 3y
end
but again with no performance gain. Timings are in the same order as the calls above:
406.065 ms (6 allocations: 763.09 MiB)
1.924 s (4317 allocations: 3.19 GiB)
1.529 s (530 allocations: 2.51 GiB)
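These timings also include generating the random inputs and allocating the 10000×10000 result (10^8 `Float64` values, which is the 763 MiB in the first line), so memory traffic may be dominating rather than the arithmetic. Here is a rough sketch of how the measurement could be separated from input generation. The helper `pairs_of`, the named function `f`, and the smaller size are my own choices for illustration, not anything from the Folds documentation.

```julia
using Folds
using BenchmarkTools

# Inputs are created once, outside the benchmark (smaller than in my runs above).
xs = rand(1_000)
ys = rand(1_000)

# Same computation as in the do-blocks above, as a named function.
f((x, y)) = 2x + 3y

# Hypothetical helper that builds the same product generator as before.
pairs_of(xs, ys) = ((x, y) for x in xs, y in ys)

# `$xs` and `$ys` are interpolated so only the map itself is timed.
@btime map(f, pairs_of($xs, $ys))
@btime Folds.map(f, pairs_of($xs, $ys))
```

With a body as cheap as `2x + 3y`, I would not be surprised if the parallel versions still fail to beat plain `map`.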