Trouble Pre-allocating Outputs

First, using @views to avoid allocation at those slicing of arrays together with @inbounds, I could reduce the execution time from 24.354 ms to 20.632 ms. I did not look very carefully, maybe you can tweak a little more, but I am not sure if the performance will be much good.

However, since you are basically dealing with 3x1 vectors, then you should really using StaticArrays.jl. The SVector is a immutable vector allocated in the stack that is much faster if you are working with small arrays. I modified you code to use SVector (hope I did everything right):

using LinearAlgebra
using BenchmarkTools
using StaticArrays

@inline do_rotation(M_in, B1, θ) =
    M_in * cos(θ) + cross(B1/norm(B1), M_in) * sin(θ) + B1/norm(B1) * dot(B1/norm(B1), M_in) * (1 - cos(θ))

@inline B1_vec(b1, X, G, F) = SVector{3,Float64}(real(b1), imag(b1), X*G + F)

function sim(X, resol, b1, G, F, M)

    Output = Vector{SVector{3,Float64}}(undef, length(X))

    @inbounds @views for x in eachindex(X)
        for tt = 1:length(b1)-1
            B1 = B1_vec(b1[tt], X[x], G[tt], F[tt])

            if abs(norm(B1)*resol) != 0.0
                M[tt+1] = do_rotation(M[tt], B1, norm(B1) * resol)
            else
                M[tt+1] = M[tt]
            end
        end
        Output[x] = M[end]
    end

    return Output
end

##
time_pts = 1000;
x_pts = 100;
resol = 4e-5;

X = collect(range(-10/2,10/2,length = x_pts));
B = zeros(ComplexF64,time_pts);
G = ones(time_pts)
F = ones(time_pts)

M = [@SVector zeros(3) for _ = 1:time_pts]
M[1] = @SVector [0.0,0.0,0.5]

@btime sim($X, $resol, $B, $G, $F, $M)

Now, take a look at the benchmark:

julia> include("teste_views.jl")
  2.352 ms (1 allocation: 2.50 KiB)

It is 10x faster. The only “problem” is that your output is a vector of vectors instead of a big matrix. But you can convert it easily.

3 Likes