Thanks!
I managed to run the code:
using ThreadsX
using BenchmarkTools
my_function(x,y) = x+y
xs=rand(1000)
ys=rand(1000)
my_Function(xy) = my_function(xy[1], xy[2])
@btime map(my_Function, Iterators.product(xs, ys))
@btime ThreadsX.map(my_Function, Iterators.product(xs, ys))
@btime ThreadsX.map(z->my_function(z...), Iterators.product(xs, ys))
@btime map(z->my_function(z...), Iterators.product(xs, ys))
but ThreadsX doesn’t seem to improve performance (timings in the same order as the calls above):
517.969 μs (6 allocations: 7.63 MiB)
4.365 ms (3862 allocations: 52.20 MiB)
4.834 ms (3863 allocations: 52.20 MiB)
517.939 μs (6 allocations: 7.63 MiB)
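For comparison, here is a minimal sketch of two things that might be worth checking. It rests on assumptions I have not confirmed: that the session may have been started with only one thread, and that a single addition per element may be too little work to outweigh threading overhead. The helper name `threaded_sum` is hypothetical, and the sketch uses only `Base.Threads` rather than ThreadsX.

```julia
using Base.Threads

# Number of threads this session was started with
# (set via `julia -t N` or the JULIA_NUM_THREADS environment variable).
println("threads: ", nthreads())

my_function(x, y) = x + y
xs = rand(1000)
ys = rand(1000)

# Serial reference via broadcasting: serial[i, j] == my_function(xs[i], ys[j]),
# the same layout that map over Iterators.product(xs, ys) produces.
serial = my_function.(xs, ys')

# Hypothetical helper: fill a preallocated matrix, splitting columns across threads.
function threaded_sum(xs, ys)
    out = Matrix{Float64}(undef, length(xs), length(ys))
    @threads for j in eachindex(ys)
        for i in eachindex(xs)
            @inbounds out[i, j] = my_function(xs[i], ys[j])
        end
    end
    return out
end

threaded = threaded_sum(xs, ys)

# Both versions should agree exactly, since each element is the same single addition.
@assert threaded == serial
```

If `nthreads()` prints 1, no threaded variant can be expected to beat the serial `map`. Timing `threaded_sum($xs, $ys)` with `@btime` would show whether preallocating the output changes the picture on a given machine.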