So what’s essentially going on is loop fusion, where the .*= means everything broadcasts to the size of y2 rather than the one dimension of eachrow(y2)? I’m not seeing it.

Note that broadcasting expands into a loop. Therefore norm is called repeatedly for each element of the array (not once per row), because y2 is two-dimensional. Since y2 is updated in place, you do not get what you expect. To see this, consider:
```julia
y2 = copy(x)
for j in 1:size(y2, 2)
    for i in 1:size(y2, 1)
        row = y2[i, :]            # reflects all earlier in-place updates
        n = LinearAlgebra.norm(row)
        y2[i, j] *= 1 / n         # each iteration scales a single element
    end
end
```
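A runnable comparison (with a small made-up 2×2 matrix; the loop is the same manual expansion as above) shows how the in-place version drifts:

```julia
using LinearAlgebra

x = [3.0 4.0; 6.0 8.0]          # made-up example; row norms are 5 and 10

# Correct result: the norms are computed from x, which never changes.
y1 = x .* (1 ./ norm.(eachrow(x)))

# Manual expansion of the fused in-place broadcast: the norm is
# recomputed from the partially updated y2 for every element.
y2 = copy(x)
for j in 1:size(y2, 2), i in 1:size(y2, 1)
    y2[i, j] *= 1 / norm(y2[i, :])
end

norm.(eachrow(y1))   # ≈ [1.0, 1.0]: every row is a unit vector
norm.(eachrow(y2))   # not [1.0, 1.0]: later columns saw corrupted norms
```

Only the first column of y2 is divided by the true row norms; by the time the loop reaches the second column, the rows have already been partially rescaled.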
This has been discussed before. For pure functions this is a footgun, but when side effects of the function are desired, it isn’t. Perhaps somebody who participated in those previous discussions can weigh in.
As commented above, this is the issue with dealiasing in broadcasting. Hopefully my PR to Julia will be merged and the issue will be fixed (still, it would be better to precompute the norm to get better performance).
Besides @bkamins’s PR (for which I am not an expert reviewer, but it seems the right way to go), would it be feasible to emit a compile-time AliasWarning, or something like that, when this kind of construction is encountered?
This doesn’t seem like an issue with parentheses and order of operations at all; it seems like not recognizing that chained dot syntax fuses the operations into one loop. For a simpler example, a .* (b .+ c) computes a[i] * (b[i] + c[i]) per iteration. Note that the parentheses do not prevent fusing; in fact, they sensibly specify the order of operations per iteration. The only way to split that example into separate loops is to broadcast separately: temp = b .+ c; a .* temp.
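For instance, with small made-up vectors, the fused one-loop version and the manually split two-loop version agree, because nothing here is mutated in place:

```julia
a, b, c = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]   # made-up data

# Fused: a single loop computing a[i] * (b[i] + c[i]) per iteration.
fused = a .* (b .+ c)

# Split: a temporary array forces two separate broadcast loops.
temp = b .+ c
unfused = a .* temp

fused == unfused   # true; fusion changes how, not what, is computed
```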
Generally, if your input is being changed per iteration, results will differ when the fused loop contains a reducing operation like norm.(eachrow(...)); put plainly, reducing operations depend on the other elements, including ones changed by previous iterations. y1 was computed by broadcasting twice over an input x, which did not change in the process. y2 was being changed by its own broadcast. If the iterations’ inputs are different, of course the output is different.
Changing the input to the unchanging x is a simple way to fix the issue: y2 .*= 1 ./ LinearAlgebra.norm.(eachrow(x));. A previous comment suggested collect, which splits the line into two broadcast loops and makes unchanging temporary data based on y2 before it is changed.
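To make that fix concrete, here is a small sketch (same made-up 2×2 matrix as before) of broadcasting against the unchanged x:

```julia
using LinearAlgebra

x = [3.0 4.0; 6.0 8.0]   # made-up example; row norms are 5 and 10
y2 = copy(x)

# Safe: the norms are read from x, which the loop never mutates, so every
# element of a row is scaled by the same (correct) factor.  Note that
# norm.(eachrow(x)) is still recomputed per element inside the fused loop;
# precomputing it into a temporary array is faster.
y2 .*= 1 ./ norm.(eachrow(x))

norm.(eachrow(y2))   # ≈ [1.0, 1.0]
```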
Again the problem is that the in-place loop has a reducing operation; the changing output is also the input. In the linked PR you argued that the equivalent NumPy code does not have this issue, but it is allocating temporary arrays in separate loops of each operation. Julia’s fused broadcast loops may look similar but are not equivalent, and I think people should understand the difference.
People have suggested manually inserting unbroadcasted identity calls into a single line to unfuse the loops and allocate temporary arrays, like x ./= identity(norm.(eachrow(x))). But that’s a lot of bloat with more dots, and a single line could be transformed by a macro like @unfuse x ./= norm.(eachrow(x)).
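@unfuse is a hypothetical name here, not an existing macro. A minimal sketch of one possible implementation simply wraps the right-hand side in an unbroadcasted identity call, so the RHS is materialized into a temporary array before the in-place broadcast loop starts:

```julia
using LinearAlgebra

# Hypothetical sketch: rewrite `dest .op= rhs` into `dest .op= identity(rhs)`,
# forcing rhs to be fully evaluated before the in-place loop runs.
macro unfuse(ex)
    ex isa Expr && length(ex.args) == 2 ||
        error("@unfuse expects an updating assignment like `x ./= rhs`")
    dest, rhs = ex.args
    return esc(Expr(ex.head, dest, :(identity($rhs))))
end

x = [3.0 4.0; 6.0 8.0]          # made-up example; row norms are 5 and 10
@unfuse x ./= norm.(eachrow(x)) # norms are computed before x is mutated
norm.(eachrow(x))               # ≈ [1.0, 1.0]
```

A real version would also want to check that ex.head is actually a dotted updating operator before rewriting.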