In the second btime, I recycle the memory slots when repeating the powers. The difference should be even larger for higher powers. As far as I can tell, the reason is that ^(A,::Integer) makes a call to Base.power_by_squaring in intfuncs.jl which does not take such memory aspects into account (in particular line 330)
Is there a good repeated squaring in some package in Julia?
The function power_by_squaring seems to be written for scalars and immutable types, I don’t see an easy fix where mul is replaced by mul!. One can make some conditions on the type, but since this is in Base references to any Matrix is not ideal. What fix did you have in mind?
Since I is not a AbstractMatrix, your example does not speak against a type-check. I believe your call reduces to a call to power_by_squaring(::Float64,::Int), which should remain fine.
julia> using StaticArrays
WARNING: using StaticArrays.pop in module Main conflicts with an existing identifier.
julia> A = @SMatrix rand(3,3);
julia> B = similar(A); B .= 3;
julia> convert(typeof(parent(A)), B)
3×3 SMatrix{3, 3, Float64, 9} with indices SOneTo(3)×SOneTo(3):
3.0 3.0 3.0
3.0 3.0 3.0
3.0 3.0 3.0
Can use similar, mutate it, and then use convert to return an answer of the correct type.
typeof(parent(A)) is so that this would work for at least some wrappers like view.
Alternatively, could do a more general
RT = Base.promote_op() do
# capture `A`, not a type!
convert(typeof(parent(A)), B)
end
if RT === typeof(parent(A))
# only try converting if it is known to return the expected type at compile time
convert(typeof(parent(A)), B)
else
B
end
An easier approach is probably to introduce some interface.
In general, the return type for ^(A::AbstractMatrix,::Integer) should probably be what is returned by similar(A) (which is always mutable)? StaticArrays can overload ^(::SMatrix, ::Integer) to call power_by_squaring directly and avoid allocating a mutable copy.
The example shows how the performance can be improved for Matrix. The same does not hold for SparseMatrix because of increased fill-in, not captured well with similar.
Therefore, I’m currently I would say it’s better to have a separate implementation specifically for dense matrices, i.e., ^(::Matrix,::Int), and keep the current as fallback eg for SMatrix and SparseMatrix.