In one of my functions, I have found that a calculation runs fastest when it is split into 4 steps. Logically it is a single line of code, but written that way, the fusing syntax of broadcasting ends up being too greedy and slows it down.
Is there a way to fence off expressions of code to indicate a boundary for the fusing syntax?
The code in my case is something very similar to:
n = 50
vector = rand(n)
a = reshape(vector, n,1,1)
b = reshape(vector, 1,n,1)
c = reshape(vector, 1,1,n)
# What I would like to do:
out = exp.(a) .* exp.(b) .* exp.(c)
# What is fastest
temp_a = exp.(a)
temp_b = exp.(b)
temp_c = exp.(c)
out = temp_a .* temp_b .* temp_c
The speed difference in benchmarking is roughly a factor of 200 for the above example.
I suspect that the single-line version recalculates the exponentials of a, b and c at every one of the n^3 positions of the output array (3n^3 calls to exp), instead of only once per element of each input (3n calls). Is there any way I could write
out = (@barrier exp.(a)) .* (@barrier exp.(b)) .* ...
instead? I should mention that a is actually a more complicated expression itself, with several operations fused together.
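To check my guess about the redundant exponentials, here is a minimal sketch that counts how often the function is called in each version. The names counted_exp and ncalls are hypothetical helpers I made up for this test; they are not part of Base.

```julia
n = 50
vector = rand(n)
a = reshape(vector, n, 1, 1)
b = reshape(vector, 1, n, 1)
c = reshape(vector, 1, 1, n)

# Hypothetical helper: behaves like exp, but increments a counter on every call.
ncalls = Ref(0)
counted_exp(x) = (ncalls[] += 1; exp(x))

# Fully fused one-liner: the fused kernel runs once per output element,
# so each of the three exponentials is evaluated n^3 times.
ncalls[] = 0
out_fused = counted_exp.(a) .* counted_exp.(b) .* counted_exp.(c)
fused_calls = ncalls[]

# Split version: each exponential is evaluated once per input element,
# and only the multiplication runs over the full n×n×n output.
ncalls[] = 0
temp_a = counted_exp.(a)
temp_b = counted_exp.(b)
temp_c = counted_exp.(c)
out_split = temp_a .* temp_b .* temp_c
split_calls = ncalls[]

println("fused: ", fused_calls, " calls, split: ", split_calls, " calls")
```

With n = 50 this gives 3 * 50^3 = 375000 calls for the fused line against 3 * 50 = 150 for the split version, while both produce the same output array, which seems consistent with the slowdown I measured.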