Reducing the memory allocation in ODEs

Just coming from Matlab, to pursue faster speed. Here is my problem and a simple case
Even run for the second time(after compilation), there are still thousands of allocation, any suggestions are welcome

function Eq(dy, y, p, t)
    v, w0, delta, phi, detwwa, k, gamma = p
    w = 4w0 * (cos(k * v * t) * cos(delta * t) * cos(phi / 2) + 1im * sin(k * v * t) * sin(delta * t) * sin(phi / 2))
    dy[1] = 1im / 2 * (w * conj(y[2]) - conj(w) * y[2]) - gamma * y[1]
    dy[2] = -1im * w / 2 * (y[1] - y[4]) - (gamma / 2 - 1im * detwwa) * y[2]
    dy[3] = 1im * conj(w) / 2 * (y[1] - y[4]) - (gamma / 2 + 1im * detwwa) * y[3]
    dy[4] = -1im / 2 * (w * conj(y[2]) - conj(w) * y[2]) + gamma * y[1]
end

using DifferentialEquations, Plots

hbar = 1.054572e-34
lamda = 359.3e-9
k = 2pi / lamda
gamma = 2pi * 22.2e6
phi = 45pi / 180
delta = 25gamma
w0 = sqrt(1.5) * delta
detwwa = 0.0
forfact = hbar * k * gamma
velfact = gamma / k
v = 1 * velfact
tend = 1.0e-6
tstart = 0
tspan = (tstart, tend)
u0 = complex([0.0, 0.0, 0.0, 1.0])
p = [v, w0, delta, phi, detwwa, k, gamma]
prob = ODEProblem(Eq, u0, tspan, p)
@time sol = solve(prob, Tsit5())

Essentially all of the allocations are from saving the output. That of course has to allocate. By removing the dense output you can reduce it a bit:

function Eq(dy, y, p, t)
    v, w0, delta, phi, detwwa, k, gamma = p
    w = 4w0 * (cos(k * v * t) * cos(delta * t) * cos(phi / 2) + 1im * sin(k * v * t) * sin(delta * t) * sin(phi / 2))
    dy[1] = 1im / 2 * (w * conj(y[2]) - conj(w) * y[2]) - gamma * y[1]
    dy[2] = -1im * w / 2 * (y[1] - y[4]) - (gamma / 2 - 1im * detwwa) * y[2]
    dy[3] = 1im * conj(w) / 2 * (y[1] - y[4]) - (gamma / 2 + 1im * detwwa) * y[3]
    dy[4] = -1im / 2 * (w * conj(y[2]) - conj(w) * y[2]) + gamma * y[1]
end

using DifferentialEquations, Plots

hbar = 1.054572e-34
lamda = 359.3e-9
k = 2pi / lamda
gamma = 2pi * 22.2e6
phi = 45pi / 180
delta = 25gamma
w0 = sqrt(1.5) * delta
detwwa = 0.0
forfact = hbar * k * gamma
velfact = gamma / k
v = 1 * velfact
tend = 1.0e-6
tstart = 0
tspan = (tstart, tend)
u0 = complex([0.0, 0.0, 0.0, 1.0])
p = [v, w0, delta, phi, detwwa, k, gamma]
prob = ODEProblem(Eq, u0, tspan, p)

@time sol = solve(prob, Tsit5());
# 0.008606 seconds (82.83 k allocations: 10.422 MiB)

@time sol = solve(prob, Tsit5(), dense = false);
# 0.004330 seconds (9.25 k allocations: 1.430 MiB)

@time sol = solve(prob, Tsit5(), save_everystep = false);
# 0.003976 seconds (39 allocations: 5.719 KiB)

And notice how low it goes if you don’t save anything.

4 Likes

Thanks for your quick reply, but here I only give a simple case, in reality I need the output to do the further calculation, not only the final output but also the values in between. What I think is to use staticarrays or vectorized operations or something else that julia has to reduce the allocations, but I am not familiar with that, and I am still trying

See the new tutorial on this: Code Optimization for Differential Equations · DifferentialEquations.jl

1 Like

I’d suggest reading Performance Tips · The Julia Language, especially avoiding global non-constant variables like the plague, which is the very first recommendation.

For Chris’ benchmarks above I get

  2.177121 seconds (4.16 M allocations: 235.169 MiB, 5.78% gc time, 99.36% compilation time)
  0.168704 seconds (237.16 k allocations: 14.915 MiB, 27.91% gc time, 94.96% compilation time)
  0.088142 seconds (227.82 k allocations: 13.500 MiB, 91.60% compilation time)

If I make all global variables constants (in a new session):

  0.014117 seconds (82.83 k allocations: 10.422 MiB)
  0.008273 seconds (9.25 k allocations: 1.430 MiB)
  0.007206 seconds (39 allocations: 5.719 KiB)

Thanks, I will do it later

I tried, no obvious difference in my computer compare with current parameter transfer method.

Hi, could you help me see another case, which is more close to my real program though it’s still a simlified version. But it already had all the problems of my final program. Even I set the save_everystep = false, there is still a lot of allocations. That means it has the possibility of improvement. I used the .+,.-,.*,./ to vectorizatied the calculation and use static arrays for all the constant parameters. The idea is from your blog and apply this to my final program improve the speed by one order of magnitude. I just wondering whether I can do more…

https://tutorials.sciml.ai/html/introduction/03-optimizing_diffeq_code.html

using DifferentialEquations, LinearAlgebra, StaticArrays

include("TestSimple02.jl")

s = [1.0 1.0]

s = sqrt.(s)

s = SMatrix{1,2,Float64}(s)   # turn into Static arrays

Polar = [-1 1]

Polar = SMatrix{1,2,Int64}(Polar)   # turn into Static arrays

delta = [2.5 ;-2.5]

delta = SVector{2,Float64}(delta)   # turn into Static arrays

h = 6.626e-34

hbar = h / 2pi

torr = 26.63e-9

Gamma = 1.0 / torr

lamda = 780e-9

k = 2.0pi / lamda

forfact = hbar * k * Gamma

velfact = Gamma / k

# -1             0            +1

f = [0.1826 0 0

    -0.1291 0.1291 0

    0.0745 -0.1491 0.0745

    0 0.1291 -0.1291

    0 0 0.1826]

for i = 1:3

    f[:, i] = f[:, i] ./ sqrt(sum(f[:, i] .^ 2))

end

f = SMatrix{5,3,Float64}(f)   # turn into Static arrays

r = f

theta1 = 0

theta2 = 0

u0 = complex([1 / 5 0 0 0 0 1 / 5 0 0 0 1 / 5 0 0 1 / 5 0 1 / 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0])

wmin = 1e-2

vround = wmin * velfact

tend = 1e-5

tstart = -tend

tspan = (tstart * Gamma, tend * Gamma)

vx = 0.1

vy = 0.1

vz = 0.1

rx = 0.5lamda

ry = 0.5lamda

rz = 0.5lamda

para01=@SVector [vx, vy, vz, rx, ry, rz, theta1, theta2, k, Gamma]

#p =  (vx, vy, vz, rx, ry, rz, theta1, theta2, k, Gamma, r, f, s, delta, Polar)

p=(para01,r, f, s, delta, Polar)

prob = ODEProblem(TestSimple02, u0, tspan, p)

#tinf = @snoopi_deep y = solve(prob, Tsit5())

#@benchmark y = solve(prob, Tsit5())

@time y = solve(prob, Tsit5())

#@timev y = solve(prob, Tsit5())

#@profile y = solve(prob,Tsit5())

#plot!(y.t, real(y[1, :]))

#end
function TestSimple02(dy, y, p, t)

        # s1  s2 are sqrt(Ik./Isat)

        #vx, vy, vz, rx, ry, rz, theta1, theta2, k, Gamma, r, f, s, delta, Polar = p

        para01,r, f, s, delta, Polar = p

        vx, vy, vz, rx, ry, rz, theta1, theta2, k, Gamma=para01

        #a, b, c = AssCal(p, t)

        # sigma.+ means the laser component polarization from the positive direction is .+

        aplus = -1.0im .* exp(1.0im .* theta1) .* cos(k .* rx .+ k .* vx .* t ./ Gamma) .+ exp(1.0im .* theta2) .* sin(k .* ry .+ k .* vy .* t ./ Gamma) .+ cos(k .* rz .+ k .* vz .* t ./ Gamma) .- 1.0im .* sin(k .* rz .+ k .* vz .* t ./ Gamma)

        bplus = sqrt(2.0) .* exp(1.0im .* theta1) .* sin(k .* rx .+ k .* vx .* t ./ Gamma) .+ sqrt(2.0) .* exp(1.0im .* theta2) .* cos(k .* ry .+ k .* vy .* t ./ Gamma)

        cplus = -1.0im .* exp(1.0im .* theta1) .* cos(k .* rx .+ k .* vx .* t ./ Gamma) .- exp(1.0im .* theta2) .* sin(k .* ry .+ k .* vy .* t ./ Gamma) .- cos(k .* rz .+ k .* vz .* t ./ Gamma) .- 1.0im .* sin(k .* rz .+ k .* vz .* t ./ Gamma)

        #sigma.- means the laser component polarization from the positive direction is .+

        aminus = -1.0im .* exp(1.0im .* theta1) .* cos(k .* rx .+ k .* vx .* t ./ Gamma) .- exp(1.0im .* theta2) .* sin(k .* ry .+ k .* vy .* t ./ Gamma) .+ cos(k .* rz .+ k .* vz .* t ./ Gamma) .+ 1.0im .* sin(k .* rz .+ k .* vz .* t ./ Gamma)

        bminus = -sqrt(2.0) .* exp(1.0im .* theta1) .* sin(k .* rx .+ k .* vx .* t ./ Gamma) .+ sqrt(2.0) .* exp(1.0im .* theta2) .* cos(k .* ry .+ k .* vy .* t ./ Gamma)

        cminus = -1.0im .* exp(1.0im .* theta1) .* cos(k .* rx .+ k .* vx .* t ./ Gamma) .+ exp(1.0im .* theta2) .* sin(k .* ry .+ k .* vy .* t ./ Gamma) .- cos(k .* rz .+ k .* vz .* t ./ Gamma) .+ 1.0im .* sin(k .* rz .+ k .* vz .* t ./ Gamma)

        Len = length(Polar)

        a = complex(zeros(Len))

        b = complex(zeros(Len))

        c = complex(zeros(Len))

        for ii = 1:Len

                if Polar[ii] == 1

                        a[ii] = aplus

                        b[ii] = bplus

                        c[ii] = cplus

                elseif Polar[ii] == -1

                        a[ii] = aminus

                        b[ii] = bminus

                        c[ii] = cminus

                end

        end

        factor = 1.0im .* s ./ 2.0sqrt(2.0)

        dy[1] = -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[22]) .- conj(f[1, 1] .* c) .* exp.(.-1.0im .* delta .* t) .* y[22]) .+

                .+r[1, 1] .* r[1, 1] .* y[16]

        dy[2] = -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[25]) .- conj(f[2, 1] .* b) .* exp.(.-1.0im .* delta .* t) .* y[22] .- conj(f[2, 2] .* c) .* exp.(.-1.0im .* delta .* t) .* y[23]) .+

                .+r[1, 1] .* r[2, 2] .* y[17]

        dy[3] = -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[28]) .- conj(f[3, 1] .* a) .* exp.(.-1.0im .* delta .* t) .* y[22] .- conj(f[3, 2] .* b) .* exp.(.-1.0im .* delta .* t) .* y[23] .- conj(f[3, 3] .* c) .* exp.(.-1.0im .* delta .* t) .* y[24]) .+

                .+r[1, 1] .* r[3, 3] .* y[18]

        dy[4] = -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[31]) .- conj(f[4, 2] .* a) .* exp.(.-1.0im .* delta .* t) .* y[23] .- conj(f[4, 3] .* b) .* exp.(.-1.0im .* delta .* t) .* y[24])

        dy[5] = -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[34]) .- conj(f[5, 3] .* a) .* exp.(.-1.0im .* delta .* t) .* y[24])

        dy[6] = -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[25]) .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[26]) .- conj(f[2, 1] .* b) .* exp.(.-1.0im .* delta .* t) .* y[25] .- conj(f[2, 2] .* c) .* exp.(.-1.0im .* delta .* t) .* y[26]) .+

                .+r[2, 1] .* r[2, 1] .* y[16] .+ r[2, 2] .* r[2, 2] .* y[19]

        dy[7] = -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[28]) .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[29]) .- conj(f[3, 1] .* a) .* exp.(.-1.0im .* delta .* t) .* y[25] .- conj(f[3, 2] .* b) .* exp.(.-1.0im .* delta .* t) .* y[26] .- conj(f[3, 3] .* c) .* exp.(.-1.0im .* delta .* t) .* y[27]) .+

                .+r[2, 1] .* r[3, 2] .* y[17] .+ r[2, 2] .* r[3, 3] .* y[20]

        dy[8] = -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[31]) .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[32]) .- conj(f[4, 2] .* a) .* exp.(.-1.0im .* delta .* t) .* y[26] .- conj(f[4, 3] .* b) .* exp.(.-1.0im .* delta .* t) .* y[27]) .+

                .+r[2, 1] .* r[4, 3] .* y[18]

        dy[9] = -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[34]) .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[35]) .- conj(f[5, 3] .* a) .* exp.(.-1.0im .* delta .* t) .* y[27])

        dy[10] = -dot(factor, f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* conj(y[28]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[29]) .+ f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[30]) .- conj(f[3, 1] .* a) .* exp.(.-1.0im .* delta .* t) .* y[28] .- conj(f[3, 2] .* b) .* exp.(.-1.0im .* delta .* t) .* y[29] .- conj(f[3, 3] .* c) .* exp.(.-1.0im .* delta .* t) .* y[30]) .+

                 .+r[3, 1] .* r[3, 1] .* y[16] .+ r[3, 2] .* r[3, 2] .* y[19] .+ r[3, 3] .* r[3, 3] .* y[21]

        dy[11] = -dot(factor, f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* conj(y[31]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[32]) .+ f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[33]) .- conj(f[4, 2] .* a) .* exp.(.-1.0im .* delta .* t) .* y[29] .- conj(f[4, 3] .* b) .* exp.(.-1.0im .* delta .* t) .* y[30]) .+

                 .+r[3, 1] .* r[4, 2] .* y[17] .+ r[3, 2] .* r[4, 3] .* y[20]

        dy[12] = -dot(factor, f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* conj(y[34]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[35]) .+ f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[36]) .- conj(f[5, 3] .* a) .* exp.(.-1.0im .* delta .* t) .* y[30]) .+

                 .+r[3, 1] .* r[5, 3] .* y[18]

        dy[13] = -dot(factor, f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* conj(y[32]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[33]) .- conj(f[4, 2] .* a) .* exp.(.-1.0im .* delta .* t) .* y[32] .- conj(f[4, 3] .* b) .* exp.(.-1.0im .* delta .* t) .* y[33]) .+

                 .+r[4, 2] .* r[4, 2] .* y[19] .+ r[4, 3] .* r[4, 3] .* y[21]

        dy[14] = -dot(factor, f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* conj(y[35]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[36]) .- conj(f[5, 3] .* a) .* exp.(.-1.0im .* delta .* t) .* y[33]) .+

                 .+r[4, 2] .* r[5, 3] .* y[20]

        dy[15] = -dot(factor, f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* conj(y[36]) .- conj(f[5, 3] .* a) .* exp.(.-1.0im .* delta .* t) .* y[36]) .+

                 .+r[5, 3] .* r[5, 3] .* y[21]

        dy[16] = dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[22]) .+ f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[25]) .+ f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* conj(y[28]) .- conj(f[1, 1] .* c) .* exp.(.-1.0im .* delta .* t) .* y[22] .- conj(f[2, 1] .* b) .* exp.(.-1.0im .* delta .* t) .* y[25] .- conj(f[3, 1] .* a) .* exp.(.-1.0im .* delta .* t) .* y[28]) .+

                 .-y[16]

        dy[17] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[25]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[28]) .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* conj(y[31]) .- conj(f[1, 1] .* c) .* exp.(.-1.0im .* delta .* t) .* y[23] .- conj(f[2, 1] .* b) .* exp.(.-1.0im .* delta .* t) .* y[26] .- conj(f[3, 1] .* a) .* exp.(.-1.0im .* delta .* t) .* y[29]) .+

                 .-y[17]

        dy[18] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[28]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[31]) .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* conj(y[34]) .- conj(f[1, 1] .* c) .* exp.(.-1.0im .* delta .* t) .* y[24] .- conj(f[2, 1] .* b) .* exp.(.-1.0im .* delta .* t) .* y[27] .- conj(f[3, 1] .* a) .* exp.(.-1.0im .* delta .* t) .* y[30]) .+

                 .-y[18]

        dy[19] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[26]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[29]) .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* conj(y[32]) .- conj(f[2, 2] .* c) .* exp.(.-1.0im .* delta .* t) .* y[26] .- conj(f[3, 2] .* b) .* exp.(.-1.0im .* delta .* t) .* y[29] .- conj(f[4, 2] .* a) .* exp.(.-1.0im .* delta .* t) .* y[32]) .+

                 .-y[19]

        dy[20] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[29]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[32]) .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* conj(y[35]) .- conj(f[2, 2] .* c) .* exp.(.-1.0im .* delta .* t) .* y[27] .- conj(f[3, 2] .* b) .* exp.(.-1.0im .* delta .* t) .* y[30] .- conj(f[4, 2] .* a) .* exp.(.-1.0im .* delta .* t) .* y[33]) .+

                 .-y[20]

        dy[21] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[30]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[33]) .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* conj(y[36]) .- conj(f[3, 3] .* c) .* exp.(.-1.0im .* delta .* t) .* y[30] .- conj(f[4, 3] .* b) .* exp.(.-1.0im .* delta .* t) .* y[33] .- conj(f[5, 3] .* a) .* exp.(.-1.0im .* delta .* t) .* y[36]) .+

                 .-y[21]

        dy[22] = dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* y[1] .+ f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* y[2] .+ f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* y[3]) .+

                 -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* y[16]) .+

                 .-1 ./ 2 .* y[22]

        dy[23] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* y[2] .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* y[3] .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* y[4]) .+

                 -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* y[17]) .+

                 .-1 ./ 2 .* y[23]

        dy[24] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* y[3] .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* y[4] .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* y[5]) .+

                 -dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* y[18]) .+

                 .-1 ./ 2 .* y[24]

        dy[25] = dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[2]) .+ f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* y[6] .+ f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* y[7]) .+

                 -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* y[16] .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[17])) .+

                 .-1 ./ 2 .* y[25]

        dy[26] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* y[6] .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* y[7] .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* y[8]) .+

                 -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* y[17] .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* y[19]) .+

                 .-1 ./ 2 .* y[26]

        dy[27] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* y[7] .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* y[8] .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* y[9]) .+

                 -dot(factor, f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* y[18] .+ f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* y[20]) .+

                 .-1 ./ 2 .* y[27]

        dy[28] = dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[3]) .+ f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[7]) .+ f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* y[10]) .+

                 -dot(factor, f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* y[16] .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[17]) .+ f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[18])) .+

                 .-1 ./ 2 .* y[28]

        dy[29] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[7]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* y[10] .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* y[11]) .+

                 -dot(factor, f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* y[17] .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* y[19] .+ f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[20])) .+

                 .-1 ./ 2 .* y[29]

        dy[30] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* y[10] .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* y[11] .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* y[12]) .+

                 -dot(factor, f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* y[18] .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* y[20] .+ f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* y[21]) .+

                 .-1 ./ 2 .* y[30]

        dy[31] = dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[4]) .+ f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[8]) .+ f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* conj(y[11])) .+

                 -dot(factor, f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* conj(y[17]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[18])) .+

                 .-1 ./ 2 .* y[31]

        dy[32] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[8]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[11]) .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* y[13]) .+

                 -dot(factor, f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* y[19] .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[20])) .+

                 .-1 ./ 2 .* y[32]

        dy[33] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[11]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* y[13] .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* y[14]) .+

                 -dot(factor, f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* y[20] .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* y[21]) .+

                 .-1 ./ 2 .* y[33]

        dy[34] = dot(factor, f[1, 1] .* c .* exp.(1.0im .* delta .* t) .* conj(y[5]) .+ f[2, 1] .* b .* exp.(1.0im .* delta .* t) .* conj(y[9]) .+ f[3, 1] .* a .* exp.(1.0im .* delta .* t) .* conj(y[12])) .+

                 -dot(factor, f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* conj(y[18])) .+

                 .-1 ./ 2 .* y[34]

        dy[35] = dot(factor, f[2, 2] .* c .* exp.(1.0im .* delta .* t) .* conj(y[9]) .+ f[3, 2] .* b .* exp.(1.0im .* delta .* t) .* conj(y[12]) .+ f[4, 2] .* a .* exp.(1.0im .* delta .* t) .* conj(y[14])) .+

                 -dot(factor, f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* conj(y[20])) .+

                 .-1 ./ 2 .* y[35]

        dy[36] = dot(factor, f[3, 3] .* c .* exp.(1.0im .* delta .* t) .* conj(y[12]) .+ f[4, 3] .* b .* exp.(1.0im .* delta .* t) .* conj(y[14]) .+ f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* y[15]) .+

                 -dot(factor, f[5, 3] .* a .* exp.(1.0im .* delta .* t) .* y[21]) .+

                 .-1 ./ 2 .* y[36]

end

Did you profile it? See

The Juno profiler will tell you which lines of code are allocating. @code_warntype can help you figure out if it’s an inference thing.

3 Likes

Thanks, I will try it

Hi, I have one new question to ask you, because I cannot find similar questions on the internet. I plan to expand my equations to accelerate the program and my problem is here, a simple example can explain my question, if initially the program is

function Eq(dy, y, p, t)
    para01, para02 = p
    dy[1] = dot(para01 * y[2], para02 * y[1]) - y[2]
    dy[2] = dot(para01 * y[1], exp.(para02 * y[2])) - y[3]
    dy[3] = -dot(para01 * y[3], para02 * y[1]) - y[1]
end

using DifferentialEquations, LinearAlgebra, Plots

para01 = [1.0, 2.0]
para02 = [3.0, 4.0]

tstart = 0
tend = 1e-2
tspan = (tstart, tend)
u0 = [0.0, 0.0, 1.0]
p = (para01, para02)
prob = ODEProblem(Eq, u0, tspan, p)
sol = solve(prob, Tsit5())
plot(sol.t, sol[1, :])r paste code here

Then I expand it to

function Eq(dy, y, p, t)
    para01, para02 = p

    for ii = 1:length(para01)
        dy[1] += para01[ii] * y[2] * para02[ii] * y[1]
        dy[2] += para01[ii] * y[1] * exp(para02[ii] * y[2])
        dy[3] += -para01[ii] * y[3] * para02[ii] * y[1]
    end

    dy[1] -= y[2]
    dy[2] -= y[3]
    dy[3] -= y[1]

end

using DifferentialEquations, LinearAlgebra, Plots

para01 = [1.0, 2.0]
para02 = [3.0, 4.0]

tstart = 0
tend = 1e-2
tspan = (tstart, tend)
u0 = [0.0, 0.0, 1.0]
p = (para01, para02)
prob = ODEProblem(Eq, u0, tspan, p)
sol = solve(prob, Tsit5())
plot(sol.t, sol[1, :])

From my understanding, they should be the same, but the final result is different, and here is a very simple case which may not help, but for my real program, if I do this expansion, it can decrease the time by a factor of 3, the problem is the result is different with and without expansion. Thanks in advance!

Changes in operation ordering will change the floating point result. That’s just a property of floating point arithmetic. If you decrease the tolerance and the ODE is not chaotic, they will converge to the same value.