Below is a version that (a) uses exp(A) rather than trying to diagonalize A (which is dangerous), and (b) uses StaticArrays in an allocation-free loop. On my computer, it is 3–4x faster than @mcabbott’s stepreference3 above:
using LinearAlgebra, StaticArrays
function step_sgj(A::SMatrix, B::SArray, F::Real, n::Integer)
eᴬ = exp(A)
C = A \ B * F
result = Vector{typeof(C)}(undef, n)
eᴬᵏ = eᴬ
result[1] = (eᴬᵏ - I) * C
for k = 2:n
eᴬᵏ *= eᴬ
result[k] = (eᴬᵏ - I) * C
end
return result
end
result = step_sgj(SMatrix{4,4}(A), SVector{4}(B), F, n)
@show stack(result) ≈ step_reference3(A, B, F, n)
@btime step_reference3($A, $B, $F, $n);
@btime step_sgj($(SMatrix{4,4}(A)), $(SVector{4}(B)), $F, $n);
prints:
stack(result) ≈ step_reference3(A, B, F, n) = true
112.666 μs (41 allocations: 326.47 KiB)
32.958 μs (3 allocations: 160.06 KiB)
(In addition to being faster, I find it a lot easier to decipher what the code is actually doing in this form, rather than the “vectorized” form.)