@. weird behavior

From @.'s documentation:
“Convert every function call or operator in expr into a “dot call” (e.g. convert f(x) to f.(x) […]
If you want to avoid adding dots for selected function calls in expr, splice those function calls in with $. […]”

So I find this very weird:

julia> ones(2,2) == ones.(2,2) # dot call here has no effect, as expected
true

julia> ones(2,2)*rand(2,2) # a regular matrix multiplication
2×2 Array{Float64,2}:
 1.54045  1.12931
 1.54045  1.12931

julia> ones(2,2).*rand(2,2) # a "dot call" multiplication
2×2 Array{Float64,2}:
 0.891492  0.744927
 0.647717  0.384459

julia> @. ones(2,2)*rand(2,2) # should be "dot call" multiplication, but isn't
2×2 Array{Float64,2}:
 1.17028  0.784963
 1.17028  0.784963

julia> @. $ones(2,2)*$rand(2,2) # suddenly this fixes it
2×2 Array{Float64,2}:
 0.946279  0.959486
 0.80436   0.596985

What is going on?

good observation. also:

julia> @. ones(2,2) .* rand(2,2)
2×2 Array{Float64,2}:
 1.6686  0.527706
 1.6686  0.527706

This is equivalent to ones.(2,2) .* rand.(2,2). This in turn reduces to broadcast((a,b,c,d) -> ones(a,b) * rand(c,d), 2, 2, 2, 2), and since all the arguments (the (2,2,2,2)) are just scalars, there’s no broadcasting, the function is just evaluated once at (2,2,2,2), hence you get the matrix multiplication behavior.

7 Likes

I think my brain is on the verge of exploding haha.

Ok, I understand why there’s no broadcasting and therefore the matrix multiplication behavior.
I still find it mind-bending that I can do

a = ones.(2,2)
b = rand.(2,2)
c = a.*b

and get the expected behavior, while doing c = ones.(2,2).*rand.(2,2) gets me something completely different.

3 Likes

The only question you need to ask yourself is, do you want to broadcast all the calls in the expression or not? When you call a * b, the only function call is *, but in ones(2,2) * b there are two function calls, one to ones and one to *, and the @. will broadcast both.

1 Like

But that doesn’t seem to be exactly what’s happening.
In my last example I’m broadcasting all function calls: ones, rand, and *:

a, b = ones.(2,2), rand.(2,2)
c = a.*b

On the other hand, as @marius311 was saying, what happens when I instead do c = ones.(2,2).*rand.(2,2) is that – because ones and rand get fed with scalars – the * operator doesn’t get broadcasted. It is this very last thing that I find weird because I’m specifically telling julia to broadcast the multiplication operator.

Here:

using Random
a,    b    =    ones(2, 2), rand(MersenneTwister(123), 2, 2)
a_br, b_br = @. ones(2, 2), rand(MersenneTwister(123), 2, 2)

a == a_br, b == b_br # true
a .* b == a_br .* b_br # true

c0, c1 = @. a*b, a_br*b_br # this is fine
c_br = @. ones(2, 2)*rand(MersenneTwister(123), 2, 2) # this does it differently
c0 == c1   # obviously true
c0 == c_br # now this is false

Again, I understand what is going on in that last @. call, but I still find it very counter-intuitive that, because the ones and rand calls don’t get broadcasted in broadcast((a,b,c,d) -> ones(a,b)*rand(c,d), 2, 2, 2, 2), the * call doesn’t get broadcasted either

That’s why broadcast has to be a syntax rather than an optimization in this case.

And that’s also why I don’t recommand using @.. Especially when you are mixing functions that returns an array and functions that do not.

No, you never tell julia to do that. You are telling julia to broadcast the expression, not .*.

2 Likes

It is not supposed to — did you read the manual on fusion? Or see this blog post.

1 Like

Very interesting readings.

So, let’s see if I got it right this time.

From using Meta.@lower I see that ones.(2,2).*rand.(2,2) is basically being “translated” into

julia> using .Broadcast: materialize, broadcasted

julia> bc = broadcasted(*, broadcasted(ones, 2, 2), broadcasted(rand, 2, 2))
Base.Broadcast.Broadcasted(*, (Base.Broadcast.Broadcasted(ones, (2, 2)), Base.Broadcast.Broadcasted(rand, (2, 2))))

julia> materialize(bc)
2×2 Array{Float64,2}:
 1.08929  0.944512
 1.08929  0.944512

here bc holds everything necessary to “unfold” the computation, which if I understand correctly is then only performed by the materialize call.

Somewhat differently instead, if I call ones(2,2).*rand(2,2) this translates into

julia> bc = broadcasted(*, ones(2,2), rand(2,2))
Base.Broadcast.Broadcasted(*, ([1.0 1.0; 1.0 1.0], [0.19239965952523885 0.8796361629139042; 0.1939935696169941 0.0831963706586214]))

julia> materialize(bc)
2×2 Array{Float64,2}:
 0.1924    0.879636 
 0.193994  0.0831964

which I could also achieve by writing

a, b = broadcasted(ones, 2, 2), broadcasted(rand, 2, 2)
bc = broadcasted(*, materialize(a), materialize(b))
materialize(bc)

So, I went and checked how materialize works. I’m not entirely sure I understand it but, from what I got, it instantiates the (otherwise lazy) Broadcasted object.
So the difference between calling it only only on the “fused” expression and calling it in each single case is:

  • doing @. ones(2,2)*rand(2,2) does not “apply” (instantiate) any function/object until the very end
  • doing broadcasted(*, ones(2,2), rand(2,2)) instead “instantiates” the two arrays before calling the broadcast operation (though instantiate is probably not the correct term here).

I think I got it. I still find it unintuitive from a “high-level” perspective, but at least I understand a bit what’s going on now (which will hopefully prevent bugs in the future).

No. There’s nothingn to “instantiate”, or anything about ones(2, 2) an rand(2, 2). Yes you get the same result by doing a no-op broadcast, but that’s not at all what happen when you call ones and rand.

I think you are getting sidetracked here by conflating the semantics of broadcasting with the implementation.

The latter is a very elegant part of Base and exemplifies a lot of advanced techniques so it is worth studying, but the semantics are very simple: in this particular case, mostly equivalent to

map((a,b,c,d) -> ones(a,b)*rand(c,d), 2, 2, 2, 2)

because numbers are iterable collections containing themselves.

2 Likes