@threads and vmap

Is it possible to speed this up further?

f(x) = sin(x)cos(x)sin(x)cos(x)
X    = repeat( [randn(10^7)], 5 )

@time                  for x in X;       map(f,x)    end    # 1.0s
@time Threads.@threads for x in X;       map(f,x)    end    # 0.3s
@time Threads.@threads for x in X;      vmap(f,x)    end    # 0.1s

It’d be easier to help if you at the very minimum said what package you got vmap from.

Asking for help making code faster when the code isn’t runnable almost never results in high-quality help.

LoopVectorization.
Sorry, I forgot there were other vmaps.

If this f is your actual function, you can simplify it:

f(x) = (sin(2 * x) / 2)^2

This definition is 5x faster on my computer.
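The simplification follows from the double-angle identity sin(2x) = 2 sin(x) cos(x), so the original product equals (sin(2x)/2)^2. A minimal sketch to check that the two definitions agree numerically; the names f_original and f_simplified are made up for this illustration, and the tolerance is an arbitrary choice:

```julia
# Original product, written with explicit multiplication signs.
f_original(x)   = sin(x) * cos(x) * sin(x) * cos(x)
# Simplified form suggested above.
f_simplified(x) = (sin(2 * x) / 2)^2

x = randn(1000)

# Both versions should agree up to floating-point rounding.
agree = all(isapprox.(f_original.(x), f_simplified.(x); atol = 1e-12))
@assert agree
```

The speedup will depend on the machine; the point is only that the simplified version needs one trigonometric call per element instead of four.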

Some time is spent allocating the result of map. The result always has the same shape, so you could pre-allocate the array and use map!.
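A small sketch of the pre-allocation idea, using a shorter input than in the original post so it runs quickly. The buffer name out is just for illustration:

```julia
f(x) = (sin(2 * x) / 2)^2

x   = randn(10^5)
out = similar(x)      # allocate the output buffer once, same shape and eltype as x

map!(f, out, x)       # fills out in place instead of allocating a new array

@assert out == map(f, x)
```

If the loop runs many times, the same out can be reused on every pass. As far as I know LoopVectorization has an in-place vmap! with the same argument order, but check its documentation before relying on that.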

An easy target here would be to just do 1/5th the amount of work and do

map(f, X[1])

since

X    = repeat( [randn(10^7)], 5 )

contains 5 references to the very same array.
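To see that repeat on a one-element vector of arrays copies references rather than data, here is a small sketch with a short array standing in for randn(10^7):

```julia
a = randn(10)
X = repeat([a], 5)

# Every element of X is the very same array object as a.
same_object = all(x -> x === a, X)
@assert same_object

# Mutating one element is therefore visible through all the others.
X[1][1] = 0.0
@assert X[5][1] == 0.0

# If five independent arrays are wanted, a comprehension creates them.
Y = [randn(10) for _ in 1:5]
@assert Y[1] !== Y[2]
```

So a benchmark over X processes the same data five times, which may or may not be what was intended.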

It is a bit difficult to grasp what you are asking for. Your example code does not produce any output, so it could be replaced with nothing, which typically executes in 0.0 seconds.

There are good reasons that we need to know a bit more about the problem, because one of the things that matters when using threads is which results should be accumulated. Is the computation inside the loop short or long? There is some general advice, and there are some methods that work on particular types of code.
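As one possible reading of the question, here is a sketch of a threaded loop that actually keeps its results, assuming the goal is one output array per input array. The results container and the small input sizes are assumptions for illustration, not something from the original post:

```julia
f(x) = (sin(2 * x) / 2)^2

X       = [randn(1000) for _ in 1:5]
results = Vector{Vector{Float64}}(undef, length(X))

Threads.@threads for i in eachindex(X)
    # Each iteration writes only to its own slot, so no locking is needed.
    results[i] = map(f, X[i])
end

@assert all(results[i] == map(f, X[i]) for i in eachindex(X))
```

With only five iterations, at most five threads can be kept busy this way, so whether it pays off depends on how expensive each iteration is and on how many threads Julia was started with.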
