Substantial increase in time in copying an instantiated broadcasted object vs a non-instantiated one

jishnub · January 11, 2024, 11:09am

I’m trying to understand the difference in performance here:

julia> using LinearAlgebra, BenchmarkTools

julia> n=40; U = UpperTriangular(rand(n,n)); C = similar(U); B = Broadcast.broadcasted(*, 2.0, U);

julia> @btime copyto!($C, $B);
  307.138 ns (0 allocations: 0 bytes)

julia> B2 = Broadcast.instantiate(B);

julia> @btime copyto!($C, $B2);
  483.149 ns (0 allocations: 0 bytes)

Why is it noticeably more expensive to copy the instantiated object? The absolute difference is small and doesn’t scale with matrix size, but I’d like to understand why it exists at all. I don’t observe a similar difference when plain Arrays are involved.

This is on the current master.
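
For reference, here is a minimal sketch of how the two objects being compared differ structurally. It assumes `B` is a `Base.Broadcast.Broadcasted` (as it is on v1.10) and peeks at its `axes` field, which is an internal detail rather than documented API. It does not explain the timing gap; it only shows that `instantiate` stores precomputed axes while both objects produce the same result.

```julia
using LinearAlgebra

n = 40
U = UpperTriangular(rand(n, n))

# Lazy broadcast object; no axes have been computed for it yet.
B = Broadcast.broadcasted(*, 2.0, U)

# `instantiate` computes the axes up front and stores them in the object.
B2 = Broadcast.instantiate(B)

# Assumption: `axes` is an internal field of `Broadcasted`.
println("B.axes  = ", B.axes)
println("B2.axes = ", B2.axes)

# The public `axes` function agrees for both objects.
println("same axes: ", axes(B) == axes(B2))

# Both forms copy the same values into a fresh destination.
C1 = copyto!(similar(U), B)
C2 = copyto!(similar(U), B2)
println("same result: ", C1 == C2 == 2.0 * U)
```

Since the values and axes agree, any timing difference has to come from the code path taken inside `copyto!`, not from the data itself.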

abraemer · January 11, 2024, 11:48am

I can’t quite reproduce such a drastic difference on 1.10:

julia> using LinearAlgebra, BenchmarkTools

julia> n=40; U = UpperTriangular(rand(n,n)); C = similar(U); B = Broadcast.broadcasted(*, 2.0, U);

julia> @btime copyto!($C, $B);
  303.206 ns (0 allocations: 0 bytes)

julia> B2 = Broadcast.instantiate(B);

julia> @btime copyto!($C, $B2);
  336.258 ns (0 allocations: 0 bytes)

jishnub · January 11, 2024, 11:49am

Yes, the difference is more pronounced on the current master than on v1.10.
