Loops are avoided (when possible) in Python because they are very slow. Loops in Julia, however, are fast (just as fast as broadcasting), and you should expect performance similar to a batched multiply (maybe marginally worse if the batched routine is being clever).
In fact, your code would be more idiomatic if you did less batching and more looping!
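To illustrate the point, here is a toy comparison (purely illustrative; when benchmarking for real, wrap both versions in functions so the loop isn't running in global scope):

```julia
x = randn(1000)

# broadcast version
y_broadcast = x .^ 2 .+ 1

# loop version: in Julia this compiles to essentially the same machine code
y_loop = similar(x)
for i in eachindex(x)
    y_loop[i] = x[i]^2 + 1
end

@assert y_loop == y_broadcast
```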
```julia
using LinearAlgebra  # for eigen, mul!

function step_reference2(A, B, F, n)
    # Eigendecomposition: A = V * Diagonal(λ) * inv(V)
    λ, V = eigen(A)
    # Precompute the constant term
    BF_transformed = V \ (B * F)
    x = zeros(eltype(λ), size(V, 1), n)
    for j in 1:n
        # This matches what you originally wrote, but appears to be wrong:
        # it scales along the wrong dimension of V.
        # @views mul!(x[:, j], V, BF_transformed)  # x[:,j] = V * BF_transformed
        # @views x[:, j] .*= expm1.(λ .* j) ./ λ   # apply the scaling
        # What I suspect you intended:
        y = expm1.(λ .* j) ./ λ .* BF_transformed  # apply the scaling first
        @views mul!(x[:, j], V, y)                 # then x[:,j] = V * y
    end
    return x
end
```
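A minimal usage sketch (the system matrices here are made up for illustration; I pass `F` as a vector so that `B * F` is a vector, which keeps `mul!` on the fast `gemv` path):

```julia
A = [-1.0  0.2;
      0.1 -2.0]        # hypothetical stable system matrix (real eigenvalues)
B = [1.0 0.0;
     0.0 1.0]
F = [0.3, 0.5]

x = step_reference2(A, B, F, 5)
@assert size(x) == (2, 5)
```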
This should have similar (or slightly better) performance and is, I would say, simpler than the Python version. Note that I suspect a mistake in your original implementation (at least the Julia version) and have provided an alternative. In the non-batched version it is much harder to make the "scaling the wrong dimension" mistake that probably crept into the batched version.
Also, expm1(x) (available in both Julia and NumPy) can be much more accurate than exp(x) - 1 when x is close to zero.
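The issue is catastrophic cancellation: for tiny x, exp(x) rounds to a value very close to 1, and subtracting 1 wipes out most significant digits, while expm1 computes the difference directly. A quick sketch:

```julia
x = 1e-12
naive    = exp(x) - 1  # loses many significant digits to cancellation
accurate = expm1(x)    # ≈ x + x^2/2, accurate to full Float64 precision

# the naive version is visibly off; expm1 agrees with x to machine precision
@assert abs(accurate - x) < abs(naive - x)
```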