Julia vs NumPy broadcasting

Do we need batched matrix multiplication? At least if @mikmoore's version is correct, the loop is doing a repeated matrix-vector multiplication x[a, j] = V[a,b] * yj[b] in a loop over j, whose batched version is a single matrix-matrix multiplication, x[a, j] = V[a,b] * ys[b, j]. For me that’s 20x quicker:

using LinearAlgebra

function step_reference3(A::Matrix, B::Vector, F::Real, n::Int)  # my adaptation of @mikmoore's version
    λ, V = eigen(A)
    BF_transformed = V \ (B * F)  # 4-element Vector{Float64}
    ys = expm1.(λ .* (1:n)') ./ λ .* BF_transformed  # ys[b, j] = expm1(λ[b] * j) / λ[b] * BF_transformed[b]
    x = V * ys  # x[a, j] = V[a,b] * ys[b, j]
end
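For comparison, the per-step loop being replaced looks roughly like this (a sketch only, not @mikmoore's exact code, and the function name is mine): every iteration builds a fresh small yj and does one matrix-vector product, so every step allocates.

function step_loop_sketch(A::Matrix, B::Vector, F::Real, n::Int)  # hypothetical sketch of the non-batched loop
    λ, V = eigen(A)
    BF_transformed = V \ (B * F)
    x = similar(BF_transformed, length(B), n)
    for j in 1:n
        yj = expm1.(λ .* j) ./ λ .* BF_transformed  # small vector allocated each iteration
        x[:, j] = V * yj  # matrix-vector product, another small allocation
    end
    x
end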

@btime step_reference($A, $B, $F, $n)  #  2.602 ms (58011 allocations: 3.29 MiB)
@btime step_reference2($A, $B, $F, $n)  # 3.401 ms (45033 allocations: 2.22 MiB)
@btime step_reference3($A, $B, $F, $n)  # 155.708 μs (41 allocations: 326.47 KiB)
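(The inputs A, B, F and n here come from earlier in the thread; to rerun the comparison yourself, a stand-in setup along these lines is enough. The values below are placeholders, not the original system.)

using LinearAlgebra, BenchmarkTools
A = randn(4, 4)  # placeholder 4×4 system matrix
B = randn(4)     # placeholder input vector
F = 1.0          # placeholder scalar
n = 10_000       # placeholder number of steps
@btime step_reference3($A, $B, $F, $n);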

While loops are quick in Julia, loops that allocate small vectors on every iteration are expensive. (The above SMatrix version didn’t run for me, on a quick attempt.)
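To make the allocation point concrete, here is a hypothetical micro-benchmark (names and sizes are mine, not from the thread): taking the matrix-vector products one column at a time allocates two small vectors per iteration, whereas the single matrix-matrix product allocates once for the whole result.

using LinearAlgebra, BenchmarkTools
V = randn(4, 4); Y = randn(4, 10_000)
percol(V, Y) = [V * Y[:, j] for j in axes(Y, 2)]  # the slice and the product each allocate a small vector per column
batched(V, Y) = V * Y  # one allocation for the whole result
@btime percol($V, $Y);
@btime batched($V, $Y);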
