First, using @views
to avoid allocation at those slicing of arrays together with @inbounds
, I could reduce the execution time from 24.354 ms to 20.632 ms. I did not look very carefully, maybe you can tweak a little more, but I am not sure if the performance will be much good.
However, since you are basically dealing with 3x1 vectors, then you should really using StaticArrays.jl. The SVector
is a immutable vector allocated in the stack that is much faster if you are working with small arrays. I modified you code to use SVector
(hope I did everything right):
using LinearAlgebra
using BenchmarkTools
using StaticArrays
@inline do_rotation(M_in, B1, θ) =
M_in * cos(θ) + cross(B1/norm(B1), M_in) * sin(θ) + B1/norm(B1) * dot(B1/norm(B1), M_in) * (1 - cos(θ))
@inline B1_vec(b1, X, G, F) = SVector{3,Float64}(real(b1), imag(b1), X*G + F)
function sim(X, resol, b1, G, F, M)
Output = Vector{SVector{3,Float64}}(undef, length(X))
@inbounds @views for x in eachindex(X)
for tt = 1:length(b1)-1
B1 = B1_vec(b1[tt], X[x], G[tt], F[tt])
if abs(norm(B1)*resol) != 0.0
M[tt+1] = do_rotation(M[tt], B1, norm(B1) * resol)
else
M[tt+1] = M[tt]
end
end
Output[x] = M[end]
end
return Output
end
##
time_pts = 1000;
x_pts = 100;
resol = 4e-5;
X = collect(range(-10/2,10/2,length = x_pts));
B = zeros(ComplexF64,time_pts);
G = ones(time_pts)
F = ones(time_pts)
M = [@SVector zeros(3) for _ = 1:time_pts]
M[1] = @SVector [0.0,0.0,0.5]
@btime sim($X, $resol, $B, $G, $F, $M)
Now, take a look at the benchmark:
julia> include("teste_views.jl")
2.352 ms (1 allocation: 2.50 KiB)
It is 10x faster. The only “problem” is that your output is a vector of vectors instead of a big matrix. But you can convert it easily.