Hello,
I would like to understand how the dot operator works. I have two specific questions.
First: in Performance Tips · The Julia Language (section: Consider using views for slicing)
I learned that “…on the left-hand side of an assignment, where array[1:5, :] = ... assigns in-place to that portion of array”. As I understand it, this means no unnecessary copies are made in memory.
Now I read here: Functions · The Julia Language that if I use the dot-operator .= :
"If the left-hand side is an array-indexing expression, e.g. X[begin+1:end] .= sin.(Y) , then it translates to broadcast! on a view, e.g. broadcast!(sin, view(X, firstindex(X)+1:lastindex(X)), Y) , so that the left-hand side is updated in-place. "
Now I wonder why the dot next to the equals sign in X[begin+1:end] .= sin.(Y) is necessary. Shouldn’t X be updated in-place anyway? What is the advantage of the view in this case?
Second: suppose I have an expression like Θ[:,k+1] = K*Θ[:,k]+B,
where K is a 10x10 matrix, Θ[:,k] is a 10x1 vector, and B is another 10x1 vector.
Can I take advantage of the dot operator in this case?
I can’t use .* because I don’t want a broadcast there, so I can’t use @. either.
Because of the array-indexing expression on the left-hand side, the .= seems unnecessary (it’s in-place anyway), and .+ without fusion doesn’t seem to make sense.
Did I get that right?
It would be great if someone could explain this to me
The original Θ[:,k+1] = K*Θ[:,k]+B first allocates a slice Θ[:,k], then the product K*Θ[:,k], and then the sum with B, before writing the result into Θ. Better would be Θ[:,k+1] .= K*view(Θ,:,k) .+ B; the .+ is useful precisely because it can be fused with the .=. Even better would be
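using LinearAlgebra # provides mul!
Θ[:,k+1] .= B # copy B into the target column
@views mul!(Θ[:,k+1], K, Θ[:,k], true, true) # then add K*Θ[:,k] to it in-place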
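To check that the three variants above really compute the same thing, here is a minimal sketch. The sizes, the random data, and the helper names (step_naive!, step_fused!, step_mul!, simulate) are made up for illustration and are not from the original post.

```julia
using LinearAlgebra  # standard library; provides mul!

# Hypothetical sizes and data, chosen only for illustration.
n, steps = 10, 5
K = rand(n, n) ./ n
B = rand(n)
Θ0 = rand(n)

# Variant 1: plain assignment. The slice, the product and the sum
# each create a temporary array before the result is written into Θ.
function step_naive!(Θ, K, B, k)
    Θ[:, k+1] = K * Θ[:, k] + B
    return Θ
end

# Variant 2: the view avoids the slice copy, and .+ fuses with .=
# (the matrix-vector product itself still allocates).
function step_fused!(Θ, K, B, k)
    Θ[:, k+1] .= K * view(Θ, :, k) .+ B
    return Θ
end

# Variant 3: copy B into the target column, then let the 5-argument mul!
# add K*Θ[:,k] to that column in-place (α = true, β = true).
function step_mul!(Θ, K, B, k)
    Θ[:, k+1] .= B
    @views mul!(Θ[:, k+1], K, Θ[:, k], true, true)
    return Θ
end

# Run one of the step functions for a number of steps, starting from Θ0.
function simulate(step!, Θ0, K, B, steps)
    Θ = zeros(length(Θ0), steps + 1)
    Θ[:, 1] .= Θ0
    for k in 1:steps
        step!(Θ, K, B, k)
    end
    return Θ
end

Θ1 = simulate(step_naive!, Θ0, K, B, steps)
Θ2 = simulate(step_fused!, Θ0, K, B, steps)
Θ3 = simulate(step_mul!, Θ0, K, B, steps)
println(Θ1 ≈ Θ2 && Θ2 ≈ Θ3)
```

The results are compared with ≈ rather than ==, since the mul! variant may accumulate the sum in a different order and differ in the last bits.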
In case it wasn’t clear, you absolutely do need the dotted assignment operator here. Without the dot, the values are still assigned ‘in-place’ into the array X. But before that happens, the expression sin.(Y) on the right-hand side creates a temporary array. If you want to avoid that temporary array, you must dot the entire expression, including the assignment operator.
Without the dot, this
X[begin+1:end] = sin.(Y)
is equivalent to this
temp = sin.(Y) # create temporary array
X[begin+1:end] = temp # assign in-place to X
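As a small sketch of the difference, the three forms below all leave the same values in the array; only the undotted one goes through a temporary. The arrays X1, X2, X3 and the contents of Y are made-up illustration data.

```julia
# Hypothetical data, chosen only for illustration.
Y = collect(range(0, 1; length = 4))
X1 = zeros(5)
X2 = zeros(5)
X3 = zeros(5)

# Undotted assignment: sin.(Y) is materialized as a new array first,
# then its values are copied into X1.
X1[begin+1:end] = sin.(Y)

# Dotted assignment: sin is applied element by element while writing
# directly into a view of X2, so no temporary array is needed.
X2[begin+1:end] .= sin.(Y)

# The form that, according to the manual, the dotted assignment translates to.
broadcast!(sin, view(X3, firstindex(X3)+1:lastindex(X3)), Y)

println(X1 == X2 == X3)
```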
Thanks a lot macabbott for that really helpful answer!
But I’m still confused about Θ[:,k+1] .= B
I’m not familiar with views. If I don’t assign a copy of B but a view, wouldn’t a subsequent change in Θ then also have to change B? I’ve tried it and seen that it doesn’t happen, but I still don’t understand how this works.
Maybe what you are missing is that Θ is only one pointer deep. It owns a contiguous block of memory. In other words, it is a mutable container of plain numbers, not of boxes that contain numbers. We can’t link part of it to B’s memory; all we can do is copy the numbers one by one from B, each overwriting one in Θ. That is what Θ[:,k+1] = B, Θ[:,k+1] .= B, and Θ[:,k+1] .= view(B,:) all do, although the precise chain of functions involved will differ.
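A tiny sketch of that behaviour, with made-up values: after the dotted assignment Θ holds its own copy of the numbers, while a view of Θ keeps pointing at Θ’s memory.

```julia
# Hypothetical data, chosen only for illustration.
Θ = zeros(3, 2)
B = [1.0, 2.0, 3.0]

Θ[:, 2] .= B          # copies the numbers from B into Θ's own memory
Θ[1, 2] = 100.0       # a later change to Θ ...
println(B[1])         # ... leaves B untouched

col = view(Θ, :, 2)   # a view, by contrast, refers to Θ's memory
col[2] = -1.0         # writing through the view ...
println(Θ[2, 2])      # ... changes Θ itself
```

So the view on the left-hand side of .= only decides where the numbers are written; it never ties Θ to B.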
Don’t use the @time macro for micro benchmarks, it is not suitable. Install the BenchmarkTools package, read the manual, and then use that. The allocation estimates you get from @time do not tell you what you are looking for.
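For example, a rough sketch of how one might compare the two assignment forms with BenchmarkTools. This assumes the package is installed; the array sizes and the function names copy_assign! and fused_assign! are made up for illustration.

```julia
using BenchmarkTools  # external package: import Pkg; Pkg.add("BenchmarkTools")

# Hypothetical data, chosen only for illustration.
X = zeros(1001)
Y = rand(1000)

# Undotted assignment: the right-hand side creates a temporary array.
function copy_assign!(X, Y)
    X[begin+1:end] = sin.(Y)
    return X
end

# Dotted assignment: the broadcast is fused with the write into X.
function fused_assign!(X, Y)
    X[begin+1:end] .= sin.(Y)
    return X
end

# Interpolate the globals with $ so that the benchmark measures the call
# itself, not the cost of accessing untyped global variables.
@btime copy_assign!($X, $Y);
@btime fused_assign!($X, $Y);
```

@btime runs each call many times and reports the minimum time together with the allocations, which makes it much less sensitive to compilation and noise than a single @time call.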
To be honest, I wonder why @time is in Base. It seems like 99% of the time BenchmarkTools should be used instead. Perhaps it would be better if @time was removed, and then you would have to make a conscious choice when timing code, instead of relying on the built-in default.