I’m trying to understand the difference in performance here:
julia> using LinearAlgebra, BenchmarkTools
julia> n=40; U = UpperTriangular(rand(n,n)); C = similar(U); B = Broadcast.broadcasted(*, 2.0, U);
julia> @btime copyto!($C, $B);
307.138 ns (0 allocations: 0 bytes)
julia> B2 = Broadcast.instantiate(B);
julia> @btime copyto!($C, $B2);
483.149 ns (0 allocations: 0 bytes)
Why is it substantially more expensive to copy the instantiated object? The absolute difference is small and doesn't scale with matrix size, but I'm trying to understand why it exists at all. I don't observe a similar difference if Arrays are involved.
This is on the current master.
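
For reference, here is a minimal sketch of how the two objects can be compared. It assumes the only structural difference is the stored `axes` field, which I have not confirmed is what causes the timing gap. The names `C1` and `C2` are just scratch destinations for this check.

```julia
using LinearAlgebra

n = 40
U = UpperTriangular(rand(n, n))

# Lazy broadcast object as constructed above: axes are not computed yet.
B = Broadcast.broadcasted(*, 2.0, U)

# Same operation after instantiate: axes are computed and stored.
B2 = Broadcast.instantiate(B)

# Inspect the axes field of each object.
@show B.axes === nothing
@show B2.axes == axes(U)

# Inspect the concrete types; they differ in their type parameters.
@show typeof(B)
@show typeof(B2)

# Sanity check (not a benchmark): both copies should produce the same result.
C1 = similar(U); copyto!(C1, B)
C2 = similar(U); copyto!(C2, B2)
@show C1 == C2
```

Running `@which copyto!(C1, B)` and `@which copyto!(C1, B2)` might also show whether the two calls dispatch to the same method.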