Hi, I am wondering how to improve the speed of large scale ODEs, which includes 136 equations. I already use the staticarray, and for loop instead of broadcast to speed up the program and it’s already 10 times faster than Matlab. I also tested and found that without saving everystep, it only has 53 allocations, which is `tspan`

independent, that means the program already quite good. I am just wondering anyone can give me more advices to improve it. Any suggestions are welcome. Here is my program, the `Constants.jl`

doesn’t need to change, and the long `Test3DMolasses.jl`

can be get in this link

https://github.com/xspeng1999/Test3DMolasses/blob/main/Test3DMolasses

I know it’s quite long, I just wondering whether I make obvious stupid mistakes, thanks in advance~

```
using DifferentialEquations, LinearAlgebra, StaticArrays
using Random, Statistics
include("Test3DMolasses.jl")
include("Constants.jl")
# parameters defination
vx, vy, vz = rand(3)
rx, ry, rz = rand(3)
theta1, theta2 = 2pi * rand(2)
para01 = @SVector [vx, vy, vz, rx, ry, rz, theta1, theta2]
delta = 20rand(4, 12)
delta = SMatrix{4,12,Float64}(delta)
s = @SVector [1.0, 1.0, 1.0, 1.0]
Polar = @SVector [1.0, 1.0, 1.0, -1.0]
Len = length(Polar)
a = complex(zeros(Len))
b = complex(zeros(Len))
c = complex(zeros(Len))
p = (para01, delta, s, Polar, a, b, c, Len)
# initial value
II = zeros(136)
II[1] = 1.0 / 12.0
II[13] = 1.0 / 12.0
II[24] = 1.0 / 12.0
II[34] = 1.0 / 12.0
II[43] = 1.0 / 12.0
II[51] = 1.0 / 12.0
II[58] = 1.0 / 12.0
II[64] = 1.0 / 12.0
II[69] = 1.0 / 12.0
II[73] = 1.0 / 12.0
II[76] = 1.0 / 12.0
II[78] = 1.0 / 12.0
u0 = complex(II)
tend = 1e-5
tstart = 0
tspan = (tstart * Gamma, tend * Gamma)
prob = ODEProblem(Test3DMolasses, u0, tspan, p)
@time y = solve(prob, Tsit5(), save_everystep=false) # 53 allocations in my computer
```

The following program is saved in a seperate file `Constants.jl`

```
## Physical constants
const h = 6.626e-34
Mu = 1.660538e-27
Mn = 59
const Mm = Mu * Mn
const lamda = 606.3e-9
const k = 2pi / lamda
const Gamma = 2pi * 8.3e6
const velfact = Gamma / k
ZXtemp = [0, 0, 0, 76.2939, 122.9512, 122.9512, 122.9512, 147.8177, 147.8177, 147.8177, 147.8177, 147.8177] # importdata('X_ZeemanCaF.mat'); # 0:1:1000G
const ZX = round.((ZXtemp .- ZXtemp[1]) / 8.3, digits=2) # unit Gamma relative to the lowest level round to 0.01 Gamma
ZBtemp = [-5 * 2 * pi * 1e6, 0, 0, 0] / Gamma
const ZB = round.(ZBtemp, digits=2) # round(ZB*100)/100; # round to 0.01Gamma
#=
#Fp 0 1 1 1
# 13 14 15 16
#mFp 0 1 0 -1
ftemp = [-0.1355 0.4649 -0.4645 0.0000 # 1 1 1
-0.1355 0.4650 -0.0001 -0.4641 # 0 2 1
-0.1354 -0.0000 0.4646 -0.4642 # -1 3 1
-0.0020 -0.4705 0.4714 -0.4722 # 0 4 0
0.5614 0.0000 -0.1853 0.1846 # -1 5 1
-0.5612 0.1865 -0.0010 -0.1832 # 0 6 1
-0.5610 0.1851 -0.1844 0.0000 # 1 7 1
-0.0000 0.0000 -0.0000 0.4083 # -2 8 2
0.0009 -0.0000 0.2880 0.2894 # -1 9 2
-0.0011 -0.1659 -0.3334 -0.1675 # 0 10 2
-0.0009 -0.2880 -0.2894 0.0000 # 1 11 2
0.0000 0.4083 0.0000 0.0000] # 2 12 2
=#
ftemp = [-0.1354 0.4646 0.4646 0
0.1354 -0.4646 0 0.4646
-0.1354 0 -0.4646 -0.4646
0 0.4714 0.4714 0.4714
0.5612 0 0.1849 0.1849
-0.5612 0.1849 0 -0.1849
0.5612 -0.1849 -0.1849 0
0 0 0 0.4082
0 0 0.2887 -0.2887
0 0.1667 -0.3333 0.1667
0 -0.2887 0.2887 0
0 0.4082 0 0] # from Yang's method
for i = 1:4
ftemp[:, i] = ftemp[:, i] ./ sqrt(sum(ftemp[:, i] .^ 2))
end
const f = ftemp
const r = ftemp # the branching ratio
```