I may have found another case. Since it is so closely related to this topic, I have unaccepted your answer and am posting it here. I’m using Julia 1.10.2.
julia> using LinearAlgebra
julia> using BenchmarkTools
julia> buffer = zeros(ComplexF64, 10000);
julia> psi = reshape(view(buffer, 1:5000), 50, 100);
julia> hpsi = reshape(view(buffer, 5001:10000), 50, 100);
julia> function _f1!(psi, hpsi)
           view(psi, :, 1) .+= view(hpsi, :, 1)
       end
_f1! (generic function with 1 method)
julia> function _f2!(psi, hpsi)
           BLAS.axpy!(50, complex(1.0, 0.0), pointer(hpsi, 1), 1, pointer(psi, 1), 1)
       end
_f2! (generic function with 1 method)
julia> @btime _f1!(psi, hpsi);
4.547 μs (3 allocations: 78.27 KiB)
julia> @btime _f2!(psi, hpsi);
45.122 ns (1 allocation: 16 bytes)
julia> @btime _f1!($psi, $hpsi);
4.585 μs (2 allocations: 78.17 KiB)
julia> @btime _f2!($psi, $hpsi);
29.575 ns (0 allocations: 0 bytes)
I think the two functions do the same thing, and the last two measurements, which interpolate the arguments with $, should be free of benchmarking artifacts. Is that right?
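To back up the claim that both do the same work, here is a minimal sketch that checks each update against a reference computed on copies. It assumes the same buffer layout as in the session, but fills the buffer with random data so the comparison is meaningful; the name expected is introduced only for this check.

```julia
using LinearAlgebra

# Same layout as in the session, but random data instead of zeros (assumption for the check).
buffer = rand(ComplexF64, 10000)
psi = reshape(view(buffer, 1:5000), 50, 100)
hpsi = reshape(view(buffer, 5001:10000), 50, 100)

# Broadcast version: the reference is built from copies before the in-place update.
expected = psi[:, 1] + hpsi[:, 1]
view(psi, :, 1) .+= view(hpsi, :, 1)
@assert psi[:, 1] ≈ expected

# BLAS version: raw pointers do not keep buffer alive, hence GC.@preserve.
expected = psi[:, 1] + hpsi[:, 1]
GC.@preserve buffer BLAS.axpy!(50, complex(1.0, 0.0), pointer(hpsi, 1), 1, pointer(psi, 1), 1)
@assert psi[:, 1] ≈ expected
```

If both assertions pass, the two updates produce the same result on the first column, so the timing gap would not be explained by the functions doing different work.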