Only explicit solvers work for stiff PDE-derived ODEs; stiff solvers crash

I apologize for posting about this PDE many times over the past few days, but I am pursuing several different approaches and each comes with its own issues.

The system is given by:

\begin{gathered} \frac{\partial c_{1}}{\partial t}=\frac{1}{\varepsilon}\left[f c_{2} \frac{q-c_{1}}{q+c_{1}}+c_{1} \frac{1-m c_{2}}{1-m c_{2}+\varepsilon_{1}}-c_{1}^{2}\right]+D_{1} \nabla^{2} c_{1} \\ \frac{\partial c_{2}}{\partial t}=c_{1} \frac{1-m c_{2}}{1-m c_{2}+\varepsilon_{1}}-c_{2}+D_{2} \nabla^{2} c_{2} \end{gathered}

where:

\begin{equation} d \equiv \frac{D_{1}}{D_{2}}=d_{0}\left(1+\frac{K-1}{1+\frac{3}{\Theta}}\right)^{3 / 2} \end{equation}

and we’re using no-flux (Neumann 0) boundary conditions.
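
As a rough sanity check on magnitudes (my own arithmetic, not from the paper): with the parameters used below (d₀ = 0.01, K = 1000, Θ = 0.044), d ≈ 0.61, so D₂ = D₁/d ≈ 1.6.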

My best attempt so far is cobbled together from many tutorials, @ChrisRackauckas’s blog posts, and Slack advice, leading to the following pared-down example (not quite an MWE, sorry).

using DifferentialEquations, LinearAlgebra, SparseArrays, Symbolics
using ModelingToolkit, SparseDiffTools, Sundials

# Generate the constants
N = 60
dx = 0.1
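# Second-difference stencil matrices for the Laplacian; the 2.0 entries set below
# implement the no-flux (zero Neumann) boundaries via the ghost-point trick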
Ax = Array(Tridiagonal([1.0 for i in 1:N-1],[-2.0 for i in 1:N],[1.0 for i in 1:N-1]))
Ay = copy(Ax)
Ax[2,1] = 2.0
Ax[end-1,end] = 2.0
Ay[1,2] = 2.0
Ay[end,end-1] = 2.0

#Parameters
f = 1.2
m = 190
q = 0.001
ϵ₁ = 0.01
ϵ = 0.01
K = 1000
Θ = 0.044
d₀ = 0.01
D₁ = 1
p = [f, m, q, ϵ, ϵ₁, D₁, d₀, K, Θ, dx]

function basic_version!(du,u,p,t)
  f, m, q, ϵ, ϵ₁, D₁, d₀, K, Θ, dx = p
  D₂ = D₁/d₀*(1 + (K-1)/(1+3/Θ))^(-3/2)
  c₁ = @view u[:,:,1]
  c₂ = @view u[:,:,2]
  dc₁ = @view du[:,:,1]
  dc₂ = @view du[:,:,2]
  
  Δ₁ = (1/dx^2)*D₁*(Ay*c₁ + c₁*Ax)
  Δ₂ = (1/dx^2)*D₂*(Ay*c₂ + c₂*Ax)
  @. dc₁ = Δ₁ + 1/ϵ*(f*c₂*(q - c₁)/(q + c₁) + c₁*(1 - m*c₂)/(1 - m*c₂ + ϵ₁) - c₁^2)
  @. dc₂ = Δ₂ + c₁*(1 - m*c₂)/(1 - m*c₂ + ϵ₁) - c₂
end

tspan = (0.0,0.1)
u0 = rand(Float64, (N,N,2))

# Jacobian Approach
du0 = similar(u0)
sparsity_pattern = Symbolics.jacobian_sparsity(basic_version!,du0,u0,p,0.0)
jac_sparsity = Float64.(sparse(sparsity_pattern))
colorvec = matrix_colors(jac_sparsity)
@show maximum(colorvec)
f = ODEFunction(basic_version!;jac_prototype=jac_sparsity,colorvec=colorvec)
prob = ODEProblem(f,u0,tspan,p)
## Solve!
println("Solving")
@time sol = solve(prob,Tsit5(), saveat=range(0, stop=tspan[2], length=101))

This code runs just fine. I haven’t been able to fully verify the solution since I don’t have an analytical expression, but it seems to behave well and nothing blows up (although Turing patterns don’t show up, and I don’t know why yet). It is slow, however, and there are many optimizations left to do. It also scales very poorly with N, and I need to go up to N = 300 or so.

The biggest optimization is of course using a stiff solver. I tried quite a few of them, e.g., KenCarp47 with and without linsolve=KrylovJL_GMRES(), CVODE_BDF, etc. They all run for a really long time and then crash with dt <= dtmin. Changing the tolerances hasn’t fixed this. I also tried autodiff=false.
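
For reference, the stiff-solver attempts looked roughly like this (a sketch, not my exact calls; KrylovJL_GMRES comes from LinearSolve):

# these all eventually abort with dt <= dtmin on my problem
using LinearSolve, Sundials
sol = solve(prob, KenCarp47())
sol = solve(prob, KenCarp47(linsolve=KrylovJL_GMRES()))
sol = solve(prob, CVODE_BDF(linear_solver=:GMRES))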

I know this error can indicate a model error, and I promise I will keep checking. I am simply curious why the stiff solvers don’t work while the explicit ones work fine.

tl;dr: all the stiff solvers I tried fail with dt <= dtmin, while the non-stiff explicit ones work just fine. What’s going on?

Don’t you want c₁, c₂ ≥ 0? I am not sure Tsit5 enforces that. You can try simple solvers to understand the dynamics, e.g. ImplicitEuler().
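
If positivity is the issue, one way to enforce it with the explicit solver is the standard isoutofdomain option (a sketch, reusing the prob defined above):

# reject any step where a concentration goes negative
sol = solve(prob, Tsit5();
            isoutofdomain = (u, p, t) -> any(x -> x < 0, u),
            saveat = range(0, stop=tspan[2], length=101))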


Won’t that kind of production term have explosive growth for values near 1? Are you sure the size of the initial conditions makes sense?


I do want that, so thanks for pointing this out and the advice on using ImplicitEuler().

Thank you, that helps significantly. The stiff solvers now work after altering the initial condition:

u0 = 1e-3*rand(Float64, (N,N,2))

The 1e-3 comes from the image below, which shows the steady state of the solution after some unknown time; it’s the only place in the paper where I managed to find any indication of scale (the paper doesn’t go into much detail about this, unfortunately).

[image from the paper: steady state of the solution, indicating the concentration scale]

Using the modified initial condition, I can now use stiff solvers and somewhat reproduce one of the patterns in the paper:

[image: pattern figure from the paper]

Here’s something that looks like pattern (b) with \Theta = 0.009
[animation: labyrinthine_0x009]

When I try to simulate something in Region I to get figure (a), I get the following non-spotted pattern (\Theta = 0.002)
[animation: white_spots_0x002]

And trying to simulate (c) and (d) (e.g., \Theta = 0.045), I get oscillatory behavior
[animation: oscillatory_0x045]

Note:

  • I am limited to a 60 \times 60 grid due to computational resources (an M1 Mac mini); the paper uses a 300 \times 300 grid. I am also using dx = dy = 0.25 while they use dx = dy = 0.1. I’ve tried tweaking these a bit without much effect, and I’m trying to get cluster access or move to my gaming PC, which has a much better processor.

Any thoughts on this? I’d appreciate any feedback.

Yeah, to describe the issue here, you just need a nonlinear ODE like u' = u^2 - u. Notice that if u > 1, then u' > 0 since u^2 > u. So u will increase, which makes u' increase quadratically; rinse and repeat and you go to infinity in essentially finite time (in terms of floating point, at least). So bounding the initial condition is important for these autocatalytic models.
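
A tiny demo of that blow-up (a quick sketch, nothing problem-specific):

# u' = u^2 - u decays to 0 from u0 < 1 but blows up in finite time from u0 > 1
using DifferentialEquations
g(u, p, t) = u^2 - u
sol_ok   = solve(ODEProblem(g, 0.9, (0.0, 10.0)), Tsit5())   # settles toward 0
sol_blow = solve(ODEProblem(g, 1.1, (0.0, 10.0)), Tsit5())   # aborts early: unstable / dt <= dtmin
@show sol_ok.u[end] sol_blow.retcode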

You shouldn’t need a cluster for this. Play with ideas like Solving Large Stiff Equations · DifferentialEquations.jl and devectorizing a bit.


Thank you for the advice as always, Chris. I have tried the following:

  1. Full devectorization following the fast_gm! example here; this is 30–40% slower than the code above.
  2. Adding a preconditioner: it doesn’t converge (I used algebraic multigrid from here).
  3. Automatic Jacobian sparsity detection was already in place, with color vectors.

I’m quite curious why devectorization leads to more allocations and is a fair bit slower. It does reproduce the exact same result as before, though, so I think it’s correct.

Full Code:

using DifferentialEquations, LinearAlgebra, SparseArrays, Symbolics
using Plots
using ModelingToolkit, SparseDiffTools, AlgebraicMultigrid
using LinearSolve   # KrylovJL_GMRES / KLUFactorization live here

# Numerics
N = 60
dx = 0.25
x = range(0, stop=30.0, step=dx)
Ax = Array(Tridiagonal([1.0 for i in 1:N-1],[-2.0 for i in 1:N],[1.0 for i in 1:N-1]))
Ay = copy(Ax)
Ax[2,1] = 2.0
Ax[end-1,end] = 2.0
Ay[1,2] = 2.0
Ay[end,end-1] = 2.0

# Parameters
f = 1.2
m = 190
q = 0.001
ϵ₁ = 0.01
ϵ = 0.01
K = 1000
Θ = 0.002
d₀ = 0.01
D₁ = 1
p = [f, m, q, ϵ, ϵ₁, D₁, d₀, K, Θ, dx]

function basic_version!(du,u,p,t)
  f, m, q, ϵ, ϵ₁, D₁, d₀, K, Θ, dx = p
  D₁ = D₁/dx^2
  D₂ = D₁/d₀*(1 + (K-1)/(1+3/Θ))^(-3/2)

  @inbounds for j in 2:N-1, i in 2:N-1
    du[i,j,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)

  end

  @inbounds for j in 2:N-1, i in 2:N-1
    du[i,j,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 
  end

  @inbounds for j in 2:N-1
    i = 1
    du[1,j,1] = D₁*(2u[i+1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for j in 2:N-1
    i = 1
    du[1,j,2] = D₂*(2u[i+1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 
  end
  @inbounds for j in 2:N-1
    i = N
    du[end,j,1] = D₁*(2u[i-1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for j in 2:N-1
    i = N
    du[end,j,2] = D₂*(2u[i-1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 
  end

  @inbounds for i in 2:N-1
    j = 1
    du[i,1,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + 2u[i,j+1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for i in 2:N-1
    j = 1
    du[i,1,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + 2u[i,j+1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 
  end
  @inbounds for i in 2:N-1
    j = N
    du[i,end,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + 2u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for i in 2:N-1
    j = N
    du[i,end,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + 2u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 
  end

  @inbounds begin
    i = 1; j = 1
    du[1,1,1] = D₁*(2u[i+1,j,1] + 2u[i,j+1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[1,1,2] = D₂*(2u[i+1,j,2] + 2u[i,j+1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 

    i = 1; j = N
    du[1,N,1] = D₁*(2u[i+1,j,1] + 2u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[1,N,2] = D₂*(2u[i+1,j,2] + 2u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 

    i = N; j = 1
    du[N,1,1] = D₁*(2u[i-1,j,1] + 2u[i,j+1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[N,1,2] = D₂*(2u[i-1,j,2] + 2u[i,j+1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 

    i = N; j = N
    du[end,end,1] = D₁*(2u[i-1,j,1] + 2u[i,j-1,1] - 4u[i,j,1]) +
                    1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[end,end,2] = D₂*(2u[i-1,j,2] + 2u[i,j-1,2] - 4u[i,j,2]) +
                    u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2] 
   end
end

tspan = (0.0,5.0)
u0 = 1e-3*rand(Float64, (N,N,2))

# Jacobian Approach
du0 = similar(u0)
sparsity_pattern = Symbolics.jacobian_sparsity(basic_version!,du0,u0,p,0.0)
jac_sparsity = Float64.(sparse(sparsity_pattern))
colorvec = matrix_colors(jac_sparsity)
@show maximum(colorvec)
f = ODEFunction(basic_version!;jac_prototype=jac_sparsity,colorvec=colorvec)
prob = ODEProblem(f,u0,tspan,p)

function algebraicmultigrid(W,du,u,p,t,newW,Plprev,Prprev,solverdata)
  if newW === nothing || newW
    Pl = aspreconditioner(ruge_stuben(convert(AbstractMatrix,W)))
  else
    Pl = Plprev
  end
  Pl,nothing
end

Base.eltype(::AlgebraicMultigrid.Preconditioner) = Float64

# Solve!
println("Solving")
@time sol = solve(prob,KenCarp47(linsolve=KrylovJL_GMRES(),precs=algebraicmultigrid,concrete_jac=true), saveat=range(0, stop=tspan[2], length=101), abstol=1e-8, reltol=1e-8)

anim = @animate for i in 1:length(sol.t)
  tit = "t = $(sol.t[i])" 
  heatmap(sol.u[i][:,:,1], c=:grays, title=tit, aspect_ratio=:equal, axis=([], false), colorbar=false)
end

gif(anim, fps=5)

EDIT: using

KenCarp4(linsolve=KLUFactorization(), precs=algebraicmultigrid)

on the original version (not devectorized) leads to a 5x speedup.

I am surprised by several things:

  • your code looks insanely sophisticated. Even if it’s more performant than mine, you can see why I say this from my own 2D code
  • for ~7000 variables, you are saying it is slow? Unless the dynamics are super fancy, I am surprised. Sparse LU should be more than enough
  • can’t you try a (constant) preconditioner like lu(I + diagm(D₁*∇², D₂*∇²))? I have had a lot of success with this; see the sketch below
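
Something along these lines (an untested sketch; the blockdiag assembly, the sign, and the missing step-size factor are assumptions you would want to adapt):

using LinearAlgebra, SparseArrays

# 1D second-difference matrix with no-flux (Neumann) boundaries, matching the scripts above
function lap1d(N, dx)
    L = spdiagm(-1 => ones(N-1), 0 => -2ones(N), 1 => ones(N-1))
    L[1,2] = 2.0; L[end,end-1] = 2.0
    L ./ dx^2
end

L1  = lap1d(N, dx)
Δ2d = kron(sparse(I, N, N), L1) + kron(L1, sparse(I, N, N))   # 2D Laplacian, N² × N²
D₂  = D₁/d₀*(1 + (K-1)/(1+3/Θ))^(-3/2)                        # same formula as in basic_version!
Pmat = lu(I - blockdiag(D₁ .* Δ2d, D₂ .* Δ2d))                # factor once, reuse every step

# constant left preconditioner; same precs signature as the AMG version above
constprecs(W, du, u, p, t, newW, Plprev, Prprev, solverdata) = (Pmat, nothing)
# then e.g. solve(prob, KenCarp47(linsolve=KrylovJL_GMRES(), precs=constprecs), ...)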

You were using a global constant N but didn’t const it.

using DifferentialEquations, LinearAlgebra, SparseArrays, Symbolics
using Plots
using ModelingToolkit, SparseDiffTools, AlgebraicMultigrid

# Numerics
const N = 60
dx = 0.25
x = range(0, stop=30.0, step=dx)
Ax = Array(Tridiagonal([1.0 for i in 1:N-1],[-2.0 for i in 1:N],[1.0 for i in 1:N-1]))
Ay = copy(Ax)
Ax[2,1] = 2.0
Ax[end-1,end] = 2.0
Ay[1,2] = 2.0
Ay[end,end-1] = 2.0

# Parameters
f = 1.2
m = 190
q = 0.001
ϵ₁ = 0.01
ϵ = 0.01
K = 1000
Θ = 0.002
d₀ = 0.01
D₁ = 1
p = [f, m, q, ϵ, ϵ₁, D₁, d₀, K, Θ, dx]

function basic_version!(du,u,p,t)
  f, m, q, ϵ, ϵ₁, D₁, d₀, K, Θ, dx = p
  D₁ = D₁/dx^2
  D₂ = D₁/d₀*(1 + (K-1)/(1+3/Θ))^(-3/2)

  @inbounds for j in 2:N-1, i in 2:N-1
    du[i,j,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)

  end

  @inbounds for j in 2:N-1, i in 2:N-1
    du[i,j,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]
  end

  @inbounds for j in 2:N-1
    i = 1
    du[1,j,1] = D₁*(2u[i+1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for j in 2:N-1
    i = 1
    du[1,j,2] = D₂*(2u[i+1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]
  end
  @inbounds for j in 2:N-1
    i = N
    du[end,j,1] = D₁*(2u[i-1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for j in 2:N-1
    i = N
    du[end,j,2] = D₂*(2u[i-1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]
  end

  @inbounds for i in 2:N-1
    j = 1
    du[i,1,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + 2u[i,j+1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for i in 2:N-1
    j = 1
    du[i,1,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + 2u[i,j+1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]
  end
  @inbounds for i in 2:N-1
    j = N
    du[i,end,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + 2u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
  end
  @inbounds for i in 2:N-1
    j = N
    du[i,end,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + 2u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]
  end

  @inbounds begin
    i = 1; j = 1
    du[1,1,1] = D₁*(2u[i+1,j,1] + 2u[i,j+1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[1,1,2] = D₂*(2u[i+1,j,2] + 2u[i,j+1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]

    i = 1; j = N
    du[1,N,1] = D₁*(2u[i+1,j,1] + 2u[i,j-1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[1,N,2] = D₂*(2u[i+1,j,2] + 2u[i,j-1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]

    i = N; j = 1
    du[N,1,1] = D₁*(2u[i-1,j,1] + 2u[i,j+1,1] - 4u[i,j,1]) +
                1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[N,1,2] = D₂*(2u[i-1,j,2] + 2u[i,j+1,2] - 4u[i,j,2]) +
                u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]

    i = N; j = N
    du[end,end,1] = D₁*(2u[i-1,j,1] + 2u[i,j-1,1] - 4u[i,j,1]) +
                    1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) + u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,1]^2)
    du[end,end,2] = D₂*(2u[i-1,j,2] + 2u[i,j-1,2] - 4u[i,j,2]) +
                    u[i,j,1] * (1-m*u[i,j,2])/(1-m*u[i,j,2]+ϵ₁) - u[i,j,2]
   end
   nothing
end

tspan = (0.0,5.0)
u0 = 1e-3*rand(Float64, (N,N,2))

du = similar(u0)
@code_warntype basic_version!(du,u0,p,0.0)

using BenchmarkTools
@btime basic_version!(du,u0,p,0.0)

Before const N (and without nothing as the return):

3.806 ms (244968 allocations: 3.85 MiB)

After those two changes:

4.029 μs (0 allocations: 0 bytes)

A full 1000x faster.


Brilliant, thank you for the lesson! 10x acceleration for solve(...)!

  • It didn’t start out this complicated; it looked more like yours, but I kept optimizing following the DifferentialEquations.jl manual. I started optimizing early because it was slow, which made testing painful.

  • It was quite slow, but I think that might be related to the model? I have given up on this model either way and moved to the Gray–Scott model. I’m currently solving with ~130k variables (2 × 256 × 256) and a time interval of (0, 5000.0) in about 10 minutes on my M1, so I think I need to keep optimizing.

  • Could you point me to somewhere where I can learn to do this? I’d like to compare to what I’m using.

Try adding @tturbo and such as well.
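
For example, on the interior loop (a sketch assuming LoopVectorization.jl; I haven’t checked how well it vectorizes with these divisions):

using LoopVectorization

# drop this into basic_version! in place of the two interior loops
@tturbo for j in 2:N-1, i in 2:N-1
  du[i,j,1] = D₁*(u[i-1,j,1] + u[i+1,j,1] + u[i,j+1,1] + u[i,j-1,1] - 4u[i,j,1]) +
              1/ϵ*(f*u[i,j,2]*(q - u[i,j,1])/(q + u[i,j,1]) +
                   u[i,j,1]*(1 - m*u[i,j,2])/(1 - m*u[i,j,2] + ϵ₁) - u[i,j,1]^2)
  du[i,j,2] = D₂*(u[i-1,j,2] + u[i+1,j,2] + u[i,j+1,2] + u[i,j-1,2] - 4u[i,j,2]) +
              u[i,j,1]*(1 - m*u[i,j,2])/(1 - m*u[i,j,2] + ϵ₁) - u[i,j,2]
end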
