'BoundsError' on ODEProblem accelerated with ModelingToolkit.jl

Hi,

I have tried to update my ODE problem to run on the latest versions of OrdinaryDiffEq and ModelingToolkit; however, I now encounter an ‘ERROR: BoundsError’ from within OrdinaryDiffEq…

I have included a reduced version of my code that reproduces the error.

using OrdinaryDiffEq
using ModelingToolkit

##Constants
const Nx = 20;
const Ny = 10;
const ρ = 2.5;
const kp = 60.0;
const kc =  200.0;
const νd = 1e-9;
const ν0 = 1e-3;
const σn = 100.0;
const τ0 = 0.6;
const Dc = 1e-3;
const a = 0.04;
const b = 0.06;
##ODE Function
# NOTE: `f` writes into the global `du` defined further below; it is only called
# once, to trace the equations symbolically.
function f(u,p,t)
    # all points: position and state equations
    @inbounds for i in 1:Nx, j in 1:Ny
        du[j,i,1] = u[j,i,2] - νd
        du[j,i,3] = 1.0 - u[j,i,2]*u[j,i,3]/Dc
    end
    # velocity equation, interior points (5-point stencil)
    @inbounds for i in 2:Nx-1, j in 2:Ny-1
        du[j,i,2] = 1/ρ*kc*(u[j,i+1,1]+u[j,i-1,1]+u[j+1,i,1]+u[j-1,i,1]-4*u[j,i,1]) - 1/ρ*kp*u[j,i,1] - 1/ρ*σn*a*asinh(u[j,i,2]/2/ν0*exp((τ0+b*log(ν0*u[j,i,3]/Dc))/a))
    end
    # velocity equation, left and right edges
    @inbounds for j in 2:Ny-1
        du[j,1,2] = 1/ρ*kc*(u[j,2,1]+u[j+1,1,1]+u[j-1,1,1]-3*u[j,1,1]) - 1/ρ*kp*u[j,1,1] - 1/ρ*σn*a*asinh(u[j,1,2]/2/ν0*exp((τ0+b*log(ν0*u[j,1,3]/Dc))/a))
        du[j,Nx,2] = 1/ρ*kc*(u[j,Nx-1,1]+u[j+1,Nx,1]+u[j-1,Nx,1]-3*u[j,Nx,1]) - 1/ρ*kp*u[j,Nx,1] - 1/ρ*σn*a*asinh(u[j,Nx,2]/2/ν0*exp((τ0+b*log(ν0*u[j,Nx,3]/Dc))/a))
    end
    # velocity equation, top and bottom edges
    @inbounds for i in 2:Nx-1
        du[1,i,2] = 1/ρ*kc*(u[1,i+1,1]+u[1,i-1,1]+u[2,i,1]-3*u[1,i,1]) - 1/ρ*kp*u[1,i,1] - 1/ρ*σn*a*asinh(u[1,i,2]/2/ν0*exp((τ0+b*log(ν0*u[1,i,3]/Dc))/a))
        du[Ny,i,2] = 1/ρ*kc*(u[Ny,i+1,1]+u[Ny,i-1,1]+u[Ny-1,i,1]-3*u[Ny,i,1]) - 1/ρ*kp*u[Ny,i,1] - 1/ρ*σn*a*asinh(u[Ny,i,2]/2/ν0*exp((τ0+b*log(ν0*u[Ny,i,3]/Dc))/a))
    end
    # velocity equation, corners
    @inbounds begin
        du[1,1,2] = 1/ρ*kc*(u[1,2,1]+u[2,1,1]-2*u[1,1,1]) - 1/ρ*kp*u[1,1,1] - 1/ρ*σn*a*asinh(u[1,1,2]/2/ν0*exp((τ0+b*log(ν0*u[1,1,3]/Dc))/a))
        du[1,Nx,2] = 1/ρ*kc*(u[1,Nx-1,1]+u[2,Nx,1]-2*u[1,Nx,1]) - 1/ρ*kp*u[1,Nx,1] - 1/ρ*σn*a*asinh(u[1,Nx,2]/2/ν0*exp((τ0+b*log(ν0*u[1,Nx,3]/Dc))/a))
        du[Ny,1,2] = 1/ρ*kc*(u[Ny,2,1]+u[Ny-1,1,1]-2*u[Ny,1,1]) - 1/ρ*kp*u[Ny,1,1] - 1/ρ*σn*a*asinh(u[Ny,1,2]/2/ν0*exp((τ0+b*log(ν0*u[Ny,1,3]/Dc))/a))
        du[Ny,Nx,2] = 1/ρ*kc*(u[Ny,Nx-1,1]+u[Ny-1,Nx,1]-2*u[Ny,Nx,1]) - 1/ρ*kp*u[Ny,Nx,1] - 1/ρ*σn*a*asinh(u[Ny,Nx,2]/2/ν0*exp((τ0+b*log(ν0*u[Ny,Nx,3]/Dc))/a))
    end
end
##ODE Speed-up
@variables u[1:Ny, 1:Nx, 1:3]
du = similar(u);
f(collect(u), nothing, 0.0);   # trace the equations symbolically into the global du
du = simplify.(du);
# [2] selects the in-place (mutating) variant of the generated functions
fastf = eval(ModelingToolkit.build_function(du, u, parallel = Symbolics.MultithreadedForm())[2]);
jac = Symbolics.sparsejacobian(vec(du), vec(collect(u)));
fjac = eval(Symbolics.build_function(jac, u, parallel = Symbolics.MultithreadedForm())[2]);
##Initial Conditions
u0 = Array{Float64, 3}(undef, Ny, Nx, 3);
u0[:,:,1] = -σn*a/kp*asinh(νd/(2*ν0)*exp((τ0+b*log(ν0/νd))/a)) .* ones(Ny,Nx) - 1e-6 .* rand(Ny,Nx);
u0[:,:,2] = νd .* ones(Ny,Nx);
u0[:,:,3] = Dc/νd .* ones(Ny,Nx);
##Solver
tspan = (0.0, 1e1)
prob_jac = ODEProblem(ODEFunction((du, u, p, t) -> fastf(du, u),
                                  jac = (du, u, p, t) -> fjac(du, u),
                                  jac_prototype = similar(jac, Float64)),
                      u0, tspan);
# solver algorithm which produces the error
solve(prob_jac, Rodas5(), reltol = 1e-8, abstol = 1e-8, maxiters = 1e10, save_everystep = false, dtmin = 1e-20, dt = 1e-10, force_dtmin = true);
# other solver algorithms don't produce errors
solve(prob_jac, Tsit5(), reltol = 1e-8, abstol = 1e-8, maxiters = 1e10, save_everystep = false, dtmin = 1e-20, dt = 1e-10, force_dtmin = true);

The error message when using the ‘Rodas5’ solver algorithm is:

ERROR: BoundsError
Stacktrace:
  [1] _copyto_impl!(dest::Vector{Float64}, doffs::Int64, src::Vector{Float64}, soffs::Int64, n::Int64)
    @ Base ./array.jl:329
  [2] copyto!
    @ ./array.jl:322 [inlined]
  [3] copyto!(dest::Vector{Float64}, src::Vector{Float64})
    @ Base ./array.jl:346
  [4] solve(cache::LinearSolve.LinearCache{SparseArrays.SparseMatrixCSC{Float64, Int64}, Vector{Float64}, Vector{Float64}, SciMLBase.NullParameters, LinearSolve.KLUFactorization, KLU.KLUFactorization{Float64, Int64}, LinearSolve.InvPreconditioner{LinearAlgebra.Diagonal{Float64, Vector{Float64}}}, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Float64}, alg::LinearSolve.KLUFactorization; kwargs::Base.Pairs{Symbol, Float64, Tuple{Symbol}, NamedTuple{(:reltol,), Tuple{Float64}}})
    @ LinearSolve ~/.julia/packages/LinearSolve/FDu3j/src/factorization.jl:296
  [5] #solve#5
    @ ~/.julia/packages/LinearSolve/FDu3j/src/common.jl:143 [inlined]
  [6] #dolinsolve#3
    @ ~/.julia/packages/OrdinaryDiffEq/SmImO/src/misc_utils.jl:105 [inlined]
  [7] perform_step!(integrator::OrdinaryDiffEq.ODEIntegrator{Rodas5{12, true, LinearSolve.KLUFactorization, typeof(OrdinaryDiffEq.DEFAULT_PRECS), Val{:forward}, true, nothing}, true, Array{Float64, 3}, …}, cache::OrdinaryDiffEq.Rosenbrock5Cache{Array{Float64, 3}, …}, repeat_step::Bool)
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/SmImO/src/perform_step/rosenbrock_perform_step.jl:1800
  [8] perform_step!
    @ ~/.julia/packages/OrdinaryDiffEq/SmImO/src/perform_step/rosenbrock_perform_step.jl:1742 [inlined]
  [9] solve!(integrator::OrdinaryDiffEq.ODEIntegrator{Rodas5{12, true, LinearSolve.KLUFactorization, typeof(OrdinaryDiffEq.DEFAULT_PRECS), Val{:forward}, true, nothing}, true, Array{Float64, 3}, …})
    @ OrdinaryDiffEq ~/.julia/packages/OrdinaryDiffEq/SmImO/src/solve.jl:475
 [10] #__solve#562
    @ ~/.julia/packages/OrdinaryDiffEq/SmImO/src/solve.jl:5 [inlined]
 [11] #solve_call#24
    @ ~/.julia/packages/DiffEqBase/x9wgy/src/solve.jl:451 [inlined]
 [12] solve_up(prob::ODEProblem{Array{Float64, 3}, Tuple{Float64, Float64}, true, SciMLBase.NullParameters, ODEFunction{true, var"#17#19", …}, …}, sensealg::Nothing, u0::Array{Float64, 3}, p::SciMLBase.NullParameters, args::Rodas5{0, true, Nothing, typeof(OrdinaryDiffEq.DEFAULT_PRECS), Val{:forward}, true, nothing}; kwargs::Base.Pairs{Symbol, Real, NTuple{7, Symbol}, NamedTuple{(:reltol, :abstol, :maxiters, :save_everystep, :dtmin, :dt, :force_dtmin), Tuple{Float64, Float64, Float64, Bool, Float64, Float64, Bool}}})
    @ DiffEqBase ~/.julia/packages/DiffEqBase/x9wgy/src/solve.jl:814
 [13] #solve#29
    @ ~/.julia/packages/DiffEqBase/x9wgy/src/solve.jl:784 [inlined]
 [14] top-level scope
    @ REPL[54263]:1

My versions are:

(@v1.8) pkg> st
Status `~/.julia/environments/v1.8/Project.toml`
  [6e4b80f9] BenchmarkTools v1.3.1
  [336ed68f] CSV v0.10.4
  [35d6a980] ColorSchemes v3.19.0
  [a93c6f00] DataFrames v1.3.4
  [0c46a032] DifferentialEquations v7.2.0
  [5789e2e9] FileIO v1.15.0
  [033835bb] JLD2 v0.4.22
  [961ee093] ModelingToolkit v8.20.0
  [1dea7af3] OrdinaryDiffEq v6.23.0
  [91a5bcdd] Plots v1.31.7
  [10745b16] Statistics

I appreciate any help to fix this issue. The objective here is to get the ODE solving error-free with the ‘Rodas5’ solver, not to solve the problem over the full time span (I’ve simplified the actual physical problem/setup).

Regards,
Luis

Could it be that the right-hand side expects input of dimension Ny x Nx x 3, whereas the Jacobian is built for a vector of length Nx * Ny * 3?
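
For reference, a quick check against the code in the first post (a sketch; jac was built from vec(du) and vec(collect(u))):

size(jac)             # expected: (Ny*Nx*3, Ny*Nx*3) = (600, 600)
size(u0), length(u0)  # ((10, 20, 3), 600); the ODE state itself is the 3-D array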

Does it work if you add this after the line du = simplify.(du):

u = vec(u)
du = vec(du)

and, accordingly, u0 = vec(u0)?
That way, both functions (right-hand side and Jacobian) would use the vectorized version.

The alternative could be to add reshape commands around the call of fjac.
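
For example (an untested sketch, reusing fastf, fjac and jac from the first post):

# keep the Ny x Nx x 3 state for the right-hand side, but hand the Jacobian
# function the vectorized view it was generated from
prob_jac = ODEProblem(ODEFunction((du, u, p, t) -> fastf(du, u),
                                  jac = (J, u, p, t) -> fjac(J, vec(u)),
                                  jac_prototype = similar(jac, Float64)),
                      u0, tspan);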

Thank you @SteffenPL for your reply. Let me add a few more details: my optimization is ‘copied’ from the example in the documentation of an earlier version of ModelingToolkit.jl (see this link), and it used to work with older versions of Julia, OrdinaryDiffEq and ModelingToolkit. Still, I have tried your changes, but the vectorized version resulted in exactly the same error…

Using the current ModelingToolkit documentation (link), I am able to run the vectorized form via sys = modelingtoolkitize(prob) and sparseprob = ODEProblem(sys, u0, (0., 1e1), jac=true, sparse=true). However, in previous benchmarks I found the Ny x Nx x 3 form to be significantly faster than the vectorized Nx * Ny * 3 form for my specific problem (and the new multithreaded version led to a segmentation fault…). Hence I would like the Ny x Nx x 3 form to work…

I wonder why the example only fails for certain solvers?

A quick answer to the last question: some solvers do not use the Jacobian. In particular, most explicit solvers like Tsit5 never need this information, so they never call it. Most likely the provided Jacobian function has a dimension error somewhere. (Which is of course always tricky to find :wink: )
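
One way to hone in on it is to call the generated Jacobian directly, outside the solver (a sketch using the objects from the first post; whether fjac wants the Ny x Nx x 3 array or its vec depends on how build_function destructured its argument):

J = similar(jac, Float64)
fjac(J, u0)        # a BoundsError here would pin the blame on the Jacobian function
fjac(J, vec(u0))   # ...and this would test the vectorized calling convention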


What if you just let ModelingToolkit do its thing instead of manually using the Symbolics to generate the sparse Jacobian and such? If you just follow the ModelingToolkit documented form, i.e.

prob_jac = ODEProblem(sys, u0, tspan, jac=true,sparse=true)

is it fine? That would narrow it down.

Hi @ChrisRackauckas, thank you for the reply. Using prob_jac = ODEProblem(sys, u0, tspan, jac=true, sparse=true) works; however, prob_jac_Mthread = ODEProblem(sys, u0, tspan, jac=true, sparse=true, parallel = ModelingToolkit.MultithreadedForm()) only worked for the model with Nx = 20, Ny = 10; solving with Nx = 60, Ny = 20 caused various TaskFailedException errors such as:

julia> @time solve(prob_jac_Mthread, Rodas5(), reltol = 1e-8, abstol = 1e-8, maxiters = 1e10, save_everystep = false, dtmin = 1e-20, dt = 1e-10, force_dtmin = true);
ERROR: TaskFailedException
Stacktrace:
  [1] wait
    @ ./task.jl:345 [inlined]
  [2] fetch
    @ ./task.jl:360 [inlined]
  [3] map
    @ ./tuple.jl:224 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/Symbolics/J8IHJ/src/build_function.jl:351 [inlined]
  [5] macro expansion
    @ ~/.julia/packages/SymbolicUtils/qulQp/src/code.jl:351 [inlined]
  [6] macro expansion
    @ ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:129 [inlined]
  [7] macro expansion
    @ ./none:0 [inlined]
  [8] generated_callfunc
    @ ./none:0 [inlined]
  [9] (::RuntimeGeneratedFunctions.RuntimeGeneratedFunction{(:ˍ₋out, :ˍ₋arg1, :ˍ₋arg2, :t), ModelingToolkit.var"#_RGF_ModTag", ModelingToolkit.var"#_RGF_ModTag", (0x8e58222a, 0x6dccd51c, 0xaf0903ef, 0x37e53944, 0xbe3d02b3)})(::SparseArrays.SparseMatrixCSC{Float64, Int64}, ::Vector{Float64}, ::Nothing, ::Float64)
    @ RuntimeGeneratedFunctions ~/.julia/packages/RuntimeGeneratedFunctions/KrkGo/src/RuntimeGeneratedFunctions.jl:117
 [10] _jac
    @ ~/.julia/packages/ModelingToolkit/5qzZn/src/systems/diffeqs/abstractodesystem.jl:295 [inlined]
 [11] calc_J!
    @ ~/.julia/packages/OrdinaryDiffEq/ccwlJ/src/derivative_utils.jl:148 [inlined]
 [12] calc_W!
    @ ~/.julia/packages/OrdinaryDiffEq/ccwlJ/src/derivative_utils.jl:652 [inlined]
 [13] calc_W!
    @ ~/.julia/packages/OrdinaryDiffEq/ccwlJ/src/derivative_utils.jl:599 [inlined]
 [14] ... 

Besides my inability to run the multithreaded version, I have two more concerns when following the current ModelingToolkit documentation:

  • I assume that modelingtoolkitize automatically vectorizes the problem and changes the type of the variable u from Array{Float64, 3} to Vector{Float64}; I previously ran benchmarks in which Array{Float64, 3} performed better for my problem, and I would hence prefer that option (or at least be able to compare both options)
  • I often resubmit the same problem, just changing u0. Running prob_jac = ODEProblem(sys, u0, tspan, jac=true, sparse=true) for larger problems is more time-consuming than computing a sparse jac once and simply providing it to ODEFunction (see the sketch below)
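
For the second point, the workflow I have in mind looks roughly like this (a sketch; remake comes from SciMLBase and is re-exported by OrdinaryDiffEq, and initial_conditions is a placeholder):

prob_jac = ODEProblem(sys, u0, tspan, jac = true, sparse = true)   # expensive, done once
for u0_new in initial_conditions
    sol = solve(remake(prob_jac; u0 = u0_new), Rodas5())           # cheap per run
end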

Can you add checkbounds=true to the ODEProblem? It might be indexing out of bounds in the multithreaded sharded form? That’s my guess.

That’s because of the SIMD you get from the Cartesian iteration. The MTK-generated code will lose that and will be slower; changing to Array{Float64, 3} will not help in the current code gen, since it would be more about the SLP vectorization pass. For this case, it would be most optimal to build the sparse Jacobian code and add that to the original f, though I wouldn’t be surprised if the difference in the f’s doesn’t make a major difference in the final ODE time, since it’s probably dominated by Jacobian time.

That makes sense. I’m mostly pushing you down this route to narrow down the kinds of possible bugs and hone in on what’s actually going on. For your real case, that split approach is a good idea.

It sounds like the issue is that the multithreaded sharded form probably “completes” the indexing: if it splits 15 ODEs into 4 even parts across 4 threads, it indexes 1:16, and errors out with a bounds error (because it should have cut the last part short). Right now that’s just a guess, though, given your last post.
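
To make that guess concrete (purely illustrative; this is not the actual MTK sharding code):

n, nthreads = 15, 4
chunk = cld(n, nthreads)                                     # 4
ranges = [(1 + (k - 1)*chunk):(k*chunk) for k in 1:nthreads]
# [1:4, 5:8, 9:12, 13:16]; the last chunk overruns n = 15 unless it is clamped,
# e.g. via min(k*chunk, n), and indexing element 16 throws a BoundsError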

Thanks @ChrisRackauckas for the explanations of how the different implementations impact ODE solving performance. I’ve tested checkbounds=true in ODEProblem(sys, nothing, (0., 1e1), jac=true, sparse=true, parallel = ModelingToolkit.MultithreadedForm(), checkbounds=true); however, it doesn’t solve the issue, and both setups (with or without checkbounds) error with ERROR: LoadError: TaskFailedException for larger model sizes (i.e. Nx > 50, Ny = 20, for a single-node job with ncpus=8 and memory=96gb on a cluster). To determine whether the issue is specific to my ODE system, I ran a multi-threaded version of the Brusselator PDE from the ModelingToolkit documentation; it works for N = 32, but fails with a segfault for N = 40 (with or without checkbounds).

Regarding the original problem (using the split approach), I have determined that the same setup without multi-threading also results in the original ERROR: BoundsError detailed in the first post. Hence, the issue should be due to the new generation of fjac and jac (in the current version of ModelingToolkit) rather than the multi-threading.

I run my simulations for an extended amount of time, so small gains will certainly make a difference in longer simulations.

The number of ODEs should be the system size times the degrees of freedom per node, i.e. Nx*Ny*3 = 60*20*3 = 3600, which shouldn’t be an issue with export JULIA_NUM_THREADS set to 2, 4 or 8…
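
For completeness, the thread count can be checked inside the session, e.g.

julia> Threads.nthreads()
8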

These PDE cases now have a lot more benchmarks, see Brusselator Work-Precision Diagrams · The SciML Benchmarks. It would be nice to capture this one as well.

Hi @ChrisRackauckas, thank you for the update. I can confirm that I am now able to run my ODE system on the latest versions of OrdinaryDiffEq and ModelingToolkit, using an implementation similar to the Brusselator Work-Precision Diagrams. Despite the ‘loss of SIMD’, I observe a speed-up with respect to my previous implementation (as presented in the first post).

Regarding the work-precision diagrams, I first provide information on the type of PDE system:

  • Each element at j,i is characterised by its position u[j,i,1] (-1e1, 1e1), velocity u[j,i,2] (1e-15, 1e2) and state u[j,i,3] (1e0, 1e9); the approximate ranges of the variables are indicated in parentheses.
  • The system specifies the linear interactions between elements, as well as a non-linear resistance term (which in turn describes a velocity-weakening effect).
  • For suitable parameter ranges, the model describes a chaotic dynamical system and ‘self-organizes’ into critical states, which result in a fractal distribution of (stick and) slip motion over time. (One larger, ‘system-wide’ slip is used for the work-precision benchmark.)

Regarding the work-precision benchmark, I have ‘tuned’ it to my needs, to establish which solver is most suitable for my studies.

  • To avoid excessive benchmark times, I focus on a (random) single large slip observed during the chaotic cycle simulations. (If needed, one might also be able to approximate the initial conditions by spatial random fields, so that this benchmark can be reproduced without the full cycle simulations).
  • I specify an error metric based on the maximal difference in the cumulative slip rate, sampled at 1-second time intervals, error = maximum(abs((sum(u[:,:,2])-sum(u_ref[:,:,2]))/sum(u_ref[:,:,2]))), where u_ref corresponds to the reference solution or appxsol in WorkPrecisionSet (here using CVODE_BDF MTK KLU, abstol = 1e-20, reltol = 1e-15). I then compute the average error from 4 simulation runs (see the sketch after this list).
  • solver tolerances are based on any combination of abstols = 1.0 ./ 10.0 .^ (6:2:20) and reltols = 1.0 ./ 10.0 .^ (4:20)
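
A sketch of that metric (assuming sol and sol_ref are interpolating ODE solutions and T is the final time; the names are placeholders):

slip_rate(sol, t) = sum(sol(t)[:, :, 2])
err(sol, sol_ref, T) = maximum(abs((slip_rate(sol, t) - slip_rate(sol_ref, t)) /
                                   slip_rate(sol_ref, t)) for t in 0.0:1.0:T)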

So this is what the benchmark simulation looks like (total spatial slip and velocity in time respectively), for a system using j = 1:18 and i = 1:66 elements:

And here is the Work-Precision Diagram, for the error metric defined previously and a maximum elapsed time of 4 h. (I was relying on Rodas5() before this exercise…)

A few additional observations:

  • KLU factorization enables more solvers to complete within the limited time, though no significant speed-up was observed between e.g. Rodas5() and Rodas5(linsolve = KLUFactorization())
  • no setup using GMRES succeeded with the simulation
  • no gain from ModelingToolkit.MultithreadedForm()
  • no completion for FBDF, ESERK5, ROCK4, GRK4A, Kvaerno4, Cash4, Hairer4 (initially tested for abstol = 1.0 ./ 10.0 .^ (12:2:14), reltol = 1.0 ./ 10.0 .^ (4:2:10)) and CVODE_BDF{:Newton, :GMRES} (initially tested for abstol = 1.0 ./ 10.0 .^ (10:2:18), reltol = 1.0 ./ 10.0 .^ (10:2:18))
  • not enough precision from QNDF

Computer information:

Julia Version 1.8.4
Commit 00177ebc4fc (2022-12-23 21:32 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 256 × AMD EPYC 7742 64-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, znver2)
  Threads: 1 on 256 virtual cores
Environment:
  JULIA_NUM_THREADS = 1

Current versions:

(@v1.8) pkg> st
Status `~/.julia/environments/v1.8/Project.toml`
 [2169fc97] AlgebraicMultigrid v0.5.1
 [fbb218c0] BSON v0.3.6
 [6e4b80f9] BenchmarkTools v1.3.2
⌃ [336ed68f] CSV v0.10.8
 [35d6a980] ColorSchemes v3.20.0
 [a93c6f00] DataFrames v1.4.4
 [f3b72e0c] DiffEqDevTools v2.33.0
 [5789e2e9] FileIO v1.16.0
 [40713840] IncompleteLU v0.2.1
 [033835bb] JLD2 v0.4.29
 [682c06a0] JSON v0.21.3
 [7f56f5a3] LSODA v0.7.2
⌃ [7ed4a6bd] LinearSolve v1.32.1
⌃ [961ee093] ModelingToolkit v8.38.0
 [09606e27] ODEInterfaceDiffEq v3.12.0
⌃ [1dea7af3] OrdinaryDiffEq v6.36.2
 [65888b18] ParameterizedFunctions v5.15.0
⌃ [91a5bcdd] Plots v1.38.0
⌃ [438e738f] PyCall v1.94.1
 [d330b81b] PyPlot v2.11.0
 [c3572dad] Sundials v4.11.4
 [0c5d862f] Symbolics v4.14.0
 [b8865327] UnicodePlots v3.3.2

Just for the record, the pileup where almost all of your solvers are showing 1e-5 (ish) accuracy strongly suggests that your reference solution isn’t accurate enough for the precision you are testing at.


This looks pretty great! Could you do a PR to add this to the benchmarking platform so we can play with it some more?

This. Whatever was chosen as the reference solution is definitely not the most accurate solution.

You cannot get a relative tolerance lower than 1e-16 with 64-bit floats. Anything lower is just going to increase the numerical error.
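
For reference:

julia> eps(Float64)   # machine epsilon of 64-bit floats
2.220446049250313e-16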

It uses KLU by default, so it shouldn’t be close to the same, it should be the same :sweat_smile:

You likely need to use a preconditioner. Unpreconditioned GMRES is never that great.
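
A sketch following the pattern from the SciML “solving large stiff equations” tutorial (IncompleteLU is already in your environment; the fill-in threshold τ is problem-dependent, and the solver choice is just an example):

using LinearSolve      # for KrylovJL_GMRES
using IncompleteLU
function incompletelu(W, du, u, p, t, newW, Plprev, Prprev, solverdata)
    if newW === nothing || newW
        Pl = ilu(convert(AbstractMatrix, W), τ = 50.0)   # rebuild the preconditioner when W changes
    else
        Pl = Plprev                                      # otherwise reuse the previous one
    end
    Pl, nothing                                          # left preconditioner only
end
solve(prob_jac, Rodas5(linsolve = KrylovJL_GMRES(), precs = incompletelu, concrete_jac = true),
      reltol = 1e-8, abstol = 1e-8)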

That’s interesting. This should be a good case for this. Probably a good test case for optimizing then.

Only FBDF is unexpected there.

Hi, I have created a file describing the model and simulation, similar to what I used for my work-precision benchmark, and added it to this PR (I have little experience with PRs, so I hope I did it correctly). A suitable work-precision benchmark then needs to be defined and added to the file…


Hey @luismunozheinen, do you have a link to a mathematical description of the system, to help me benchmark MethodOfLines against this as well?

Hi @xtalax, a description of the model (modulo a few modifications) can be found, for example, here. It should facilitate the understanding of my implementation.

Thanks, this is useful. I don’t recognize these boundary conditions though; what are they: Neumann, Dirichlet or Robin?

Edit: Ah, looks like Neumann.

Actually, there are no constraints applied to the boundary blocks; they are free to move along their one spatial degree of freedom (as shown in Figure 1 of the linked publication). FYI, the ‘boundary conditions’ of the actual physical problem are replicated by applying a heterogeneous parameter distribution.