However, it looks like they are executed differently, and a memcpy takes place in the .= variant. Am I missing something, or is it a bug / language inconsistency?
The first version results in an allocated array whose entries are then assigned in place to a. So there is an allocated array somewhere within the circshift call which consumes memory.
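You can see this directly with @allocated (a minimal sketch; exact byte counts vary, and a first call would include compilation overhead):

```julia
a = collect(1.0:1000.0)

# Warm up to exclude compilation allocations from the measurement.
a .= circshift(a, 1)

# `circshift(a, 1)` materializes a new array, which is then copied
# element-wise into `a` by the `.=` assignment.
bytes = @allocated (a .= circshift(a, 1))
println(bytes)  # roughly sizeof(a) bytes for the temporary
```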
Thanks for the replies. I understand what is going on: the standard circshift has to create a buffer. But here https://docs.julialang.org/en/v1/manual/functions/#man-vectorized it is stated that X .= ... allows one to preallocate the output, which is not what happens for circshift. Does output preallocation work only for trivial element-wise math?
In ffts the output will indeed be a ShiftedArrays view, and hence that in-place update should work.
However, fft itself allocates intermediate arrays.
There is also ffts!
See my previous reply. But in this case, what is the whole idea of the X .= ... syntax? It seems trivial that if two implementations are given, e.g. f(x)
and f!(out, x),
then z .= f(x) could be evaluated as f!(z, x)?
The dot essentially means “element-wise”. So z .= ... means write ... element-wise into z (think of a for loop). A typical use case is something like z .= f.(x), where the operations “apply f element-wise to x” and “assign the result of f.(x) element-wise to z” get fused into a single “for loop”, so as to avoid a temporary allocation for f.(x).
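To make the fusion concrete, here is a small sketch (with a made-up scalar f) of the single loop that z .= f.(x) effectively becomes:

```julia
f(v) = 2v + 1                 # some scalar function, just for illustration
x = [1.0, 2.0, 3.0]
z = similar(x)

z .= f.(x)                    # fused: no temporary array for f.(x)

# Conceptually equivalent to:
for i in eachindex(z, x)
    z[i] = f(x[i])
end
```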
If you don’t do element-wise applications, like you do with circshift, the dot syntax doesn’t help much (as you’ve seen in your first case). That’s why dedicated in-place functions like circshift! exist.
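For circshift specifically, the dedicated in-place variant writes straight into a preallocated destination (note that the destination must be a different array from the source):

```julia
src  = collect(1:5)
dest = similar(src)

circshift!(dest, src, 1)      # shifts src by 1 into dest, no extra buffer
# dest is now [5, 1, 2, 3, 4]
```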
OK, so x .= is intended to work only for trivial element-wise operations? I.e. in your example, if I had a manually vectorized function f(x), then z .= f(x)
would copy data, but the auto-vectorized z .= f.(x)
would do everything in place?
The thing is that, in contrast to MATLAB (as far as I remember), you wouldn’t want to define a vectorized version of f, say f(y::Matrix{Float64}), but only a scalar variant f(x::Float64), and then simply use f.(y).
Example:
# Let's say we want to compute the `exp(sin(x))` of all elements of a matrix
f(x::Float64) = exp(sin(x))
M = rand(10,10) # our matrix
R = zeros(10,10) # preallocated matrix to store the result
R .= f.(M)
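For contrast, a manually vectorized variant (the hypothetical g below) would defeat the fusion: g(M) allocates its own result first, and .= then only copies it over:

```julia
f(x::Float64) = exp(sin(x))
g(M::Matrix{Float64}) = exp.(sin.(M))   # manually vectorized: allocates

M = rand(10, 10)
R = zeros(10, 10)

R .= g(M)    # g(M) builds a temporary matrix, then copies it into R
R .= f.(M)   # fused: f is applied element-wise directly into R
```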
The thing is, in my code I “vectorize” over matrices. But thanks for your help! I think I see the logic behind the syntax. The only comment I have is that the docs ultimately state that .= allows one to preallocate the output, but this mechanism is actually very limited, as we just discussed.
If I may ask another question: is there any standard macro that can translate z .= f(x) into f!(z, x)? It would be very helpful for algorithm readability.
Well, either use dot-broadcasting then or define in-place versions of your functions. But maybe it isn’t even necessary to “vectorize over matrices”? Perhaps a simple loop will do as well? (I don’t know. Depends on context of course.)
.= doesn’t preallocate anything. It writes into preallocated memory if you will. Where is this mentioned in the docs? Can you point out the paragraph / line?
Well, as we’ve tried to explain above, these two things do very different operations, so why would you want a macro to translate one into the other? Also, not every f has an in-place counterpart f!.
A fully in-place fft! is not possible, since fft! can’t act in place on a ShiftedArrays view. Hence FFTW.jl needs to collect the array.
The performance drawback is usually not that critical but if you run out of memory that’s of course an issue.