Efficient Broadcasting in Julia: CartesianIndex vs. Reshape

I came across a discussion post on using [CartesianIndex()] for singleton insertion in Julia arrays. Intrigued by potential performance differences, I conducted a benchmark comparing it with the reshape method.
Benchmark code:

(M, N, P, K) = (20, 30, 40, 50)
na = [CartesianIndex()]
A = randn(M, N, P)
B = randn(M, N, K)

@time C = A .* B[:,:,na,:]
@time D = A .* reshape(B,M,N,1,K)


0.001471 seconds (50 allocations: 9.385 MiB)
0.001449 seconds (8 allocations: 9.156 MiB)

The reshape approach seems slightly faster, and Iā€™m seeking insights into the reasons behind this performance difference. Any explanations or thoughts are welcome.

The difference here is not the CartesianIndex but rather that the first variant first creates a new array B[:,:,na,:] and then computes, whereas reshape uses the existing data.

If you change it to

julia> @time C = A .* @view B[:,:,na,:];
0.002465 seconds (6 allocations: 9.156 MiB)

the runtime and allocations will be similar in both cases.

Of course, you could go further and try to remove the extra allocations by pre-allocating an array and doing @. C = A * @view B[:,:,na,:] but that is probably not the aim of this question :wink:

1 Like