With help from @ffevotte and @Benny, I think I finally understand how to describe A ./ norm.(eachrow(A)) conceptually. Consider the matrix A = [3 4; 6 8; 9 12]. At first I imagined that norm is broadcast over eachrow(A) to produce [5.0, 10.0, 15.0], and that this vector is then expanded to the matrix
[ 5.0 5.0
10.0 10.0
15.0 15.0]
And then we have element-wise division between A and the above matrix of norms. With this conceptual model, norm is only called 3 times.
(*When I say “conceptually”, I mean conceptually. Obviously there are performance optimizations involved, so this is not precisely what happens under the hood.)
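That first mental model can be written out explicitly. This is only a sketch of the model, not of what broadcasting actually does; the variable names norms, expanded and result are made up for illustration.

```julia
using LinearAlgebra

A = [3 4; 6 8; 9 12]

# Step 1 of the (first, incorrect) model: one norm per row, 3 calls in total.
norms = norm.(eachrow(A))

# Step 2: copy that column of norms across every column of A.
expanded = repeat(norms, 1, size(A, 2))

# Step 3: plain element-wise division of two 3×2 matrices.
result = A ./ expanded
```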
However, as @pdeffebach showed above, for this simple example (even without the in-place assignment to A), the norm function gets called 6 times. So the correct conceptual model is that eachrow(A) gets expanded before norm is applied. In detail, we have
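1. eachrow(A) is expanded to
# `TEMP` is merely a conceptual variable: a 3×2 matrix whose entries are rows of A.
# (Typed literally, Julia would concatenate these vectors instead of nesting them.)
TEMP = [
[3, 4] [3, 4]
[6, 8] [6, 8]
[9, 12] [9, 12]
]
2. The following function is applied element-wise to the matrices A and TEMP:
# norm comes from the LinearAlgebra standard library
function f(a, temp)
    a / norm(temp)
end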
so the result is
[0.6 0.8
0.6 0.8
0.6 0.8]
and norm gets called 6 times, because f was applied element-wise, i.e. to the six corresponding elements of A and TEMP.
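To check this model, here is a small sketch that counts the calls and also builds TEMP for real. The names calls, counted_norm, fused_calls, B and C are hypothetical helpers for this illustration, and TEMP is built with a comprehension because the literal above would concatenate rather than nest.

```julia
using LinearAlgebra

A = [3 4; 6 8; 9 12]

# Hypothetical helper: wraps norm and counts how often it is called.
calls = Ref(0)
counted_norm(v) = (calls[] += 1; norm(v))

# The fused broadcast from this thread, with the counting wrapper swapped in.
B = A ./ counted_norm.(eachrow(A))
fused_calls = calls[]   # 6 for this 3×2 matrix: one call per element of A

# The conceptual model: a 3×2 matrix whose (i, j) entry is row i of A.
TEMP = [A[i, :] for i in 1:3, j in 1:2]
f(a, temp) = a / norm(temp)
C = f.(A, TEMP)

println(fused_calls)
println(B ≈ C)   # both routes give the same normalized matrix
```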
Is there a missing performance optimization here? It seems like we ought to be able to tell that we can apply norm to eachrow(A) and then expand the resulting vector along the second dimension, rather than expanding eachrow(A) first and then applying norm to the expanded matrix.
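For what it's worth, the reordering can be done by hand by binding the norms to a variable, so that the two broadcasts are no longer fused into one. This is a sketch of a manual workaround, not a statement about what the compiler could or should do automatically; counted_norm and hoisted_calls are again made-up names.

```julia
using LinearAlgebra

A = [3 4; 6 8; 9 12]

# Hypothetical helper: wraps norm and counts how often it is called.
calls = Ref(0)
counted_norm(v) = (calls[] += 1; norm(v))

# Materialize the norms first. Because the result is bound to a variable,
# this broadcast is a separate statement and is not fused with the division.
norms = counted_norm.(eachrow(A))
hoisted_calls = calls[]   # one call per row

# The 3-element vector is broadcast against the 3×2 matrix along the second dimension.
B = A ./ norms
```

The cost is one temporary vector of length size(A, 1), in exchange for calling norm once per row instead of once per element.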