With help from @ffevotte and @Benny, I think I finally understand how to describe `A ./ norm.(eachrow(A))` conceptually. Consider the matrix `A = [3 4; 6 8; 9 12]`. At first my conceptual model was that `norm` is broadcast over `eachrow(A)` to produce `[5.0, 10.0, 15.0]`, and that this vector is then expanded to the matrix
[ 5.0 5.0
10.0 10.0
15.0 15.0]
Then we have element-wise division between `A` and the above matrix of norms. With this conceptual model, `norm` is called only 3 times.
(When I say “conceptually”, I really do mean conceptually: obviously there are performance optimizations involved, so this is not precisely what happens under the hood.)
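Written out by hand, that initial 3-call model corresponds to something like the following sketch (the names `rownorms` and `norm_matrix` are just for illustration):

```julia
using LinearAlgebra

A = [3 4; 6 8; 9 12]

# The 3-call mental model, step by step:
rownorms = norm.(eachrow(A))                   # [5.0, 10.0, 15.0]; norm called 3 times
norm_matrix = repeat(rownorms, 1, size(A, 2))  # expand the vector into a 3×2 matrix
A ./ norm_matrix                               # element-wise division
```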
However, as @pdeffebach showed above, for this simple example (even without the in-place assignment to `A`), the `norm` function gets called 6 times. So the correct conceptual model is that `eachrow(A)` gets expanded before `norm` is applied. In detail, we have:
1. `eachrow(A)` is expanded to
# `TEMP` is merely a conceptual variable
TEMP = [
    [3, 4]   [3, 4]
    [6, 8]   [6, 8]
    [9, 12]  [9, 12]
]
2. The following function is applied element-wise to the matrices `A` and `TEMP` (`norm` here is from the LinearAlgebra standard library):
function f(a, temp)
    a / norm(temp)
end
so the result is
[0.6 0.8
0.6 0.8
0.6 0.8]
and `norm` gets called 6 times, because `f` is applied element-wise, i.e. once for each of the six corresponding elements of `A` and `TEMP`.
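To check the call count concretely, here is a small sketch that wraps `norm` in a counter (`ncalls` and `counted_norm` are just illustrative names); the fused expression behaves like `broadcast(f, A, eachrow(A))` with `f` as defined above, so the wrapper runs once per element of `A`:

```julia
using LinearAlgebra

A = [3 4; 6 8; 9 12]

# Count how many times the norm is actually computed.
ncalls = Ref(0)
counted_norm(x) = (ncalls[] += 1; norm(x))

A ./ counted_norm.(eachrow(A))   # same fused broadcast as the original expression
@show ncalls[]                   # prints: ncalls[] = 6
```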
Is there a missing performance optimization here? It seems like we ought to be able to tell that we can apply `norm` to `eachrow(A)` first and then expand the resulting vector along the second dimension, rather than expanding `eachrow(A)` first and then applying `norm` to the expanded matrix.
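For what it's worth, one way to get the 3-call behavior today is to compute the row norms with an ordinary (non-dotted) call such as `map`, which is not part of the dot syntax and therefore is not fused into the surrounding broadcast:

```julia
using LinearAlgebra

A = [3 4; 6 8; 9 12]

# `map` is evaluated eagerly before the broadcast, so `norm` runs only
# 3 times (once per row); only the division itself is broadcast over A.
A ./ map(norm, eachrow(A))
```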