Matrix power not memory optimal

It seems that computing A^p with integer p for a dense matrix A with ^(A,p) is not memory allocation optimal:

julia> using BenchmarkTools, LinearAlgebra
julia> A=randn(256,256); A=A/norm(A);
julia> @btime A8=A^8;
  813.054 μs (6 allocations: 1.50 MiB)
julia> @btime begin; 
    A2=similar(A); mul!(A2,A,A); 
    A4=similar(A); mul!(A4,A2,A2); 
    A8=A2; mul!(A8,A4,A4); end;
  729.419 μs (4 allocations: 1.00 MiB)
julia> @btime ((A^2)^2)^2;
  752.717 μs (6 allocations: 1.50 MiB)

In the second btime, I recycle the memory slots when repeating the powers. The difference should be even larger for higher powers. As far as I can tell, the reason is that ^(A,::Integer) makes a call to Base.power_by_squaring in intfuncs.jl which does not take such memory aspects into account (in particular line 330)

Is there a good repeated squaring in some package in Julia?

1 Like

this would be a pretty good first pr for someone. the fix is quite straightforward

The function power_by_squaring seems to be written for scalars and immutable types, I don’t see an easy fix where mul is replaced by mul!. One can make some conditions on the type, but since this is in Base references to any Matrix is not ideal. What fix did you have in mind?

1 Like

How would it handle immutable matrices like FillArrays.jl or StaticArrays.jl?

1 Like

StaticArrays are not necessarily immutable. Not sure we have many immutable matrices, but I get your point.

1 Like

They hide in many places

julia> using LinearAlgebra

julia> I
UniformScaling{Bool}
true*I

julia> (2I)^3
UniformScaling{Int64}
8*I

Since I is not a AbstractMatrix, your example does not speak against a type-check. I believe your call reduces to a call to power_by_squaring(::Float64,::Int), which should remain fine.

Not saying I have a good solution though…

My bad. But I’m not sure it’s easy to statically deduce whether a matrix is mutable or not?

1 Like
julia> using StaticArrays
WARNING: using StaticArrays.pop in module Main conflicts with an existing identifier.

julia> A = @SMatrix rand(3,3);

julia> B = similar(A); B .= 3;

julia> convert(typeof(parent(A)), B)
3×3 SMatrix{3, 3, Float64, 9} with indices SOneTo(3)×SOneTo(3):
 3.0  3.0  3.0
 3.0  3.0  3.0
 3.0  3.0  3.0

Can use similar, mutate it, and then use convert to return an answer of the correct type.

typeof(parent(A)) is so that this would work for at least some wrappers like view.
Alternatively, could do a more general

RT = Base.promote_op() do
  # capture `A`, not a type!
  convert(typeof(parent(A)), B)
end
if RT === typeof(parent(A))
    # only try converting if it is known to return the expected type at compile time
    convert(typeof(parent(A)), B)
else
    B
end

An easier approach is probably to introduce some interface.

1 Like

In general, the return type for ^(A::AbstractMatrix,::Integer) should probably be what is returned by similar(A) (which is always mutable)? StaticArrays can overload ^(::SMatrix, ::Integer) to call power_by_squaring directly and avoid allocating a mutable copy.

The example shows how the performance can be improved for Matrix. The same does not hold for SparseMatrix because of increased fill-in, not captured well with similar.

julia> A=sprandn(256,256,0.1); A=A/norm(A);
julia> @btime A8=A^8;
  34.995 ms (19 allocations: 3.01 MiB)
julia> @btime begin; 
          A2=similar(A); mul!(A2,A,A); 
          A4=similar(A); mul!(A4,A2,A2); 
          A8=A2; mul!(A8,A4,A4); end;
  205.170 ms (51 allocations: 3.26 MiB)

Therefore, I’m currently I would say it’s better to have a separate implementation specifically for dense matrices, i.e., ^(::Matrix,::Int), and keep the current as fallback eg for SMatrix and SparseMatrix.

Perhaps this should be specialized for a StridedMatrix, which should cover most dense types.

2 Likes