Maybe I should add some more informative comments as well.
The following statement can be a bit misleading sometimes:
X[:,:,i] .= A[:,:,i] * B[:,:,i]
One might be tempted to think that this means "perform the matrix multiplication in-place at
X[:,:,i]", but that is not what is happening here. I call it misleading because often the expected “in-place” behaviour is what actually emerges, but that depends on the rest of the right hand side.
So then what is happening here? Well basically (if we simplify a little) you are writing a syntactic sugar for
broadcast!(identity, view(X,:,:,i), A[:,:,i] * B[:,:,i])
So in other words: you first index into A and create a temporary matrix of that slice, then the same thing for B. You then use those temporary matrices to compute the matrix multiplication into a new temporary result matrix. Then you more or less map the
identity function over each element of the result matrix and store the resulting value into the corresponding position of
X (though I am not sure if
identity is maybe special-cased and avoided).
So a lot of allocation is going on.
In the example I gave you basically only allocate the views in each iteration of the loop. those are just shallow wrappers and avoid the copy-on-index behaviour outlined above. Then by using
mul! we avoid the temporary result matrix.
Note, if you are interessted you can look at what is actually happening to the “dot” notation using the following command
julia> Meta.lower(Main, :(X[:,:,i] .= A[:,:,i] * B[:,:,i]))
1 ─ %1 = (Base.getproperty)(Base.Broadcast, :materialize!) │
│ %2 = (Base.dotview)(X, :, :, i) │
│ %3 = (Base.getproperty)(Base.Broadcast, :broadcasted) │
│ %4 = (Base.getindex)(A, :, :, i) │
│ %5 = (Base.getindex)(B, :, :, i) │
│ %6 = %4 * %5 │
│ %7 = (%3)(Base.identity, %6) │
│ %8 = (%1)(%2, %7) │
└── return %8 │