Right multiplication by a Diagonal matrix effectively scales the columns of the matrix.
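For instance, as a small illustration (not part of the benchmarks below):
julia> using LinearAlgebra

julia> [1 2; 3 4] * Diagonal([10, 100])  # first column scaled by 10, second by 100
2×2 Matrix{Int64}:
 10  200
 30  400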
Currently, the default Matrix-Diagonal product appears to use copy_similar:
(*)(A::AbstractMatrix, D::Diagonal) =
    rmul!(copy_similar(A, promote_op(*, eltype(A), eltype(D.diag))), D)
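For context, copy_similar allocates a fresh array with the promoted element type and copies A into it, so the destination is effectively written twice: once by the copy and once more by rmul!. A rough paraphrase of the helper (going by its behavior, not the exact source):
copy_similar(A, T) = copyto!(similar(A, T, size(A)), A)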
which seems a bit wasteful. Wouldn’t an uninitialized matrix be a better destination here? Comparing a naive implementation of this product with and without bounds checking, the present implementation turns out to be slower than even the naive version that keeps bounds checks enabled.
julia> A = ones(3000, 2000); D = Diagonal(ones(2000));
julia> @btime $A * $D;
42.105 ms (4 allocations: 45.78 MiB)
julia> Base.@propagate_inbounds function muldiag(A::Matrix{T}, D::Diagonal{R}) where {R,T}
           @assert size(A,2) == size(D,1) "sizes are incompatible"
           C = similar(A, promote_type(T, R))  # uninitialized destination with the promoted eltype
           d = diag(D)
           for (j, dj) in enumerate(d), i in axes(C, 1)
               C[i, j] = dj * A[i, j]
           end
           return C
       end
muldiag (generic function with 1 method)
julia> @btime muldiag($A, $D);
36.357 ms (3 allocations: 45.79 MiB)
julia> @btime @inbounds muldiag($A, $D);
28.553 ms (3 allocations: 45.79 MiB)
julia> A * D == muldiag(A, D)
true
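The promote_type in the call to similar is what handles mixed element types; as a quick check (a small example reusing the muldiag defined above):
julia> B = [1 2 3; 4 5 6];

julia> B * Diagonal([0.5, 1.0, 2.0]) == muldiag(B, Diagonal([0.5, 1.0, 2.0]))
true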
Perhaps there are instances where the copy is essential, but it seems to be an expensive operation:
julia> @btime LinearAlgebra.copy_similar($A, Float64);
34.992 ms (2 allocations: 45.78 MiB)
so perhaps we can do without it after all?
Using mul! instead of rmul! seems to resolve this:
julia> muldiag2(A, D) = mul!(similar(A, promote_type(eltype(A), eltype(D))), A, D);
julia> @btime muldiag2($A, $D);
28.734 ms (4 allocations: 45.78 MiB)
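If this holds up more generally, the default method could presumably be rewritten along the same lines, keeping promote_op for the destination element type as the current definition does (a sketch of what that might look like, not an actual patch; note that promote_op(*, ...) and promote_type can differ for some element types, e.g. dimensionful numbers):
(*)(A::AbstractMatrix, D::Diagonal) =
    mul!(similar(A, promote_op(*, eltype(A), eltype(D.diag))), A, D)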