ThreadsX mapreduce performance

Does it? That’s not intentional.

The naive non-@generated version had a problem because LV wants to know what the functions are at macro-expansion time.
So another fix to support the non-@generated version would be to lift that limitation.

But it will have to know what the op is regardless, so that it knows how to re-associate correctly. Is that the problem you were hitting?

julia> using LoopVectorization

julia> function map_tturbo!(f::F, y, x) where {F}
           @tturbo for i in eachindex(y, x)
               y[i] = f(x[i])
           end
       end
map_tturbo! (generic function with 1 method)

julia> x = rand(10_000); y = similar(x);

julia> map_tturbo!(x -> log1p(x)/3, y, x);

julia> y ≈ log1p.(x) ./ 3
true

LV not knowing what the function does means it won’t optimize as well. You should also perhaps make sure the function inlines.
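
For example, defining the kernel as a named function with @inline (a hypothetical g standing in for the anonymous function above) is one way to encourage that:

julia> @inline g(x) = log1p(x)/3
g (generic function with 1 method)

julia> map_tturbo!(g, y, x);

julia> y ≈ log1p.(x) ./ 3
true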

In that case, the .instance should only be required for op. f should be free to be relatively arbitrary.
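
On the reduction side, the shape would be something like the following sketch (mapreduce_tturbo is a hypothetical name, and it assumes the opaque-function support shown above also works inside a reduction, which I haven’t verified):

function mapreduce_tturbo(f::F, x) where {F}
    s = zero(eltype(x))
    @tturbo for i in eachindex(x)
        s += f(x[i])  # the op (+) is visible to the macro, so it can re-associate; f stays opaque
    end
    s
end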