Recently ported a Matlab implementation over to Julia. My RHS function evaluates faster in Julia, which is good, but the vast majority of Julia solvers seem to require far more function evaluations. CVODE_BDF outperforms ode15s, but QNDF and the others all require an order of magnitude more function evals, which I found surprising.

My problem is of the form

`A inv(B) C dX = K`

where matrices A, B, C and RHS K are all functions of state X. I solve this using

`dX = (A * (B \ C)) \ K`

which is passed to the solver. It's quite a costly RHS function.
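Concretely, the RHS has roughly this shape (a sketch only; the `assemble_*` helpers below are hypothetical stand-ins for the actual non-trivial, state-dependent matrix construction):

```julia
using LinearAlgebra

# Hypothetical stand-ins for the real state-dependent matrix assembly.
assemble_A(X) = Diagonal(1.0 .+ X .^ 2)
assemble_B(X) = Diagonal(1.1 .+ 0.0 .* X)
assemble_C(X) = Diagonal(2.0 .- X)
assemble_K(X) = -X

# Out-of-place RHS: solve A*inv(B)*C * dX = K for dX,
# without ever forming inv(B) explicitly.
function dynamics_combined(X, p, t)
    A, B = assemble_A(X), assemble_B(X)
    C, K = assemble_C(X), assemble_K(X)
    return (A * (B \ C)) \ K
end
```

Using `B \ C` rather than `inv(B) * C` keeps the linear solves factorization-based, which is both faster and more accurate.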

Function evals:
CVODE_BDF: 128
Matlab ode15s: 408
ddebdf (Fortran solver in Julia): 1383
QNDF: 5604
RadauIIA5 (best native Julia solver): 2547

I have also tried some other solvers:

My question might be a bit too vague; I guess I was just surprised that an older, less feature-rich solver (CVODE_BDF) seems to be more efficient for my system than any of the newer solvers. Should I not worry about it and just go ahead and use CVODE_BDF? Or is there something else going on, and should I place more trust in the Julia solvers despite them being less efficient?

I was doubly surprised since QNDF is supposed to be quite similar to ode15s, but they seem to be handling my system very differently given the number of function evals. (I have checked that the Matlab and Julia functions agree, and the solvers converge when using small tolerances.)

How many Jacobians are used? For big problems, the Jacobians often matter at least as much as the function calls. Also, have you checked that the error being achieved is the same? I.e., when you say reltol=1e-5, the accuracy won't be the same for all the solvers.

Comparing CVODE_BDF and QNDF at abstol=1e-6, reltol=1e-3 against QNDF at abstol=1e-12, reltol=1e-8, they both appear to be similar in accuracy: the maximum difference from the low-tolerance solution is around 1e-5 for both. (Feel free to correct me if this is not a good way of measuring accuracy.)
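The comparison I'm doing looks roughly like this (a sketch on a toy stiff problem, since the real system isn't shareable; the tight-tolerance QNDF solve serves as the reference):

```julia
using OrdinaryDiffEq

# Toy stiff stand-in for the real system.
f(u, p, t) = -1000.0 * (u - cos(t))
prob = ODEProblem(f, 0.0, (0.0, 1.0))

# Tight-tolerance reference solution.
sol_ref = solve(prob, QNDF(), abstol=1e-12, reltol=1e-8)

# Coarse solve at the working tolerances.
sol = solve(prob, QNDF(), abstol=1e-6, reltol=1e-3)

# Maximum difference, evaluating the reference's interpolation
# at the coarse solution's time points.
err = maximum(abs.(sol.u .- sol_ref.(sol.t)))
```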

Edit: just checked Matlab ode15s as well; also of order 1e-5, but a slightly bigger error at 1.45e-5, versus 1.06e-5 for QNDF and 1.27e-5 for CVODE_BDF.

(percentage errors around 5 and 6 %)

CVODE_BDF creates 3 Jacobians, QNDF creates 80.

Sorry, I feel like my problem is likely too general to really tackle; I understand that specific solver performance is system-dependent and I haven't really given much information about my system. It's not easily shareable because matrices A, B, C are non-trivially calculated from the state vector X, rather than having any nice analytical form…

Solver stats if interesting:

CVODE_BDF()

Number of function 1 evaluations: 128
Number of function 2 evaluations: 0
Number of W matrix evaluations: 17
Number of linear solves: 0
Number of Jacobians created: 3
Number of nonlinear solver iterations: 125
Number of nonlinear solver convergence failures: 0
Number of rootfind condition calls: 0
Number of accepted steps: 106
Number of rejected steps: 0

QNDF()

Number of function 1 evaluations: 5604
Number of function 2 evaluations: 0
Number of W matrix evaluations: 80
Number of linear solves: 162
Number of Jacobians created: 80
Number of nonlinear solver iterations: 162
Number of nonlinear solver convergence failures: 0
Number of rootfind condition calls: 0
Number of accepted steps: 79
Number of rejected steps: 1

Can you use a work-precision diagram on the time series error? There are still some differences in the methods, such as how CVODE uses an I-controller while FBDF uses a PI-controller. This means that with the same tolerance we generally see FBDF about an order of magnitude more accurate, though that is very dependent on the ODE. So for any statement of performance it's best to take accuracy into account simultaneously, hence the work-precision diagrams in SciMLBenchmarks.jl: Benchmarks for Scientific Machine Learning (SciML) and Equation Solvers · The SciML Benchmarks.

while the error measured over the whole time series:

Notice any differences? While in many cases the error measurement doesn't make too much of a difference, in some cases we have noticed some of the older BDF methods (specifically lsoda and CVODE) can get quite different results given the implementation of the interpolation and other factors combined.

So without a WPD on the time series the answer is kind of a shrug.

It would be nice to get our hands on this case and see what is causing the difference, though, and to see whether swapping the time-stepping choice to a PController makes a noticeable difference here. If you could please share the ODE, we can play with it and see what's causing the difference.

QNDF is more like ode15s, and FBDF is more like CVODE_BDF. You should probably compare FBDF and CVODE in these cases, as those two should in theory be rather close.
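The like-for-like comparison is along these lines (a sketch; the toy problem stands in for the real RHS, and note the stats field is named `destats` on older OrdinaryDiffEq versions):

```julia
using OrdinaryDiffEq, Sundials

# Toy stiff stand-in problem.
f(u, p, t) = -1000.0 * (u - cos(t))
prob = ODEProblem(f, 0.0, (0.0, 1.0))

# Same tolerances for both the native and the Sundials BDF.
sol_fbdf  = solve(prob, FBDF(),      abstol=1e-6, reltol=1e-3)
sol_cvode = solve(prob, CVODE_BDF(), abstol=1e-6, reltol=1e-3)

# Compare function-eval counts (`destats` on older versions).
println(sol_fbdf.stats.nf, " vs ", sol_cvode.stats.nf)
```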

Thanks for the reply Chris. I used a test_sol from FBDF with abstol and reltol set to 10^-14, and followed the code given in POLLU Work-Precision Diagrams · The SciML Benchmarks. I wasn't exactly sure how to specify end-point vs. over-time error, but here is what I got:
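The sweep itself, following the POLLU recipe, was along these lines (a sketch with a stand-in problem; in `WorkPrecisionSet`, `error_estimate = :final` gives the end-point error while `:l2` measures it over the whole time series):

```julia
using OrdinaryDiffEq, Sundials, DiffEqDevTools

# Stand-in stiff problem.
f(u, p, t) = -1000.0 * (u - cos(t))
prob = ODEProblem(f, 0.0, (0.0, 1.0))

# Tight-tolerance reference, as in the POLLU benchmark.
test_sol = TestSolution(solve(prob, FBDF(), abstol=1e-14, reltol=1e-14))

abstols = 1.0 ./ 10.0 .^ (6:9)
reltols = 1.0 ./ 10.0 .^ (3:6)
setups  = [Dict(:alg => FBDF()), Dict(:alg => CVODE_BDF())]

# :final = end-point error; :l2 = error over the whole time series.
wp = WorkPrecisionSet(prob, abstols, reltols, setups;
                      appxsol = test_sol, error_estimate = :l2)
```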

So it seems like CVODE_BDF is performing quite well here over all tolerances, and better than FBDF.

Running FBDF with the solve option "controller = OrdinaryDiffEq.PredictiveController()" doesn't seem to change anything (the destats are identical), so I'm not sure if maybe I'm not specifying that option correctly?
I'm running the following to change the controller:

Hey, where did the code go? I think I have a fix, and I think it's related to finite differencing. First off, why was it using autodiff=false? But secondly, I am curious whether it's only different in the case where autodiff=false and diff_type === Val{:forward}.

Hi, sorry, didn't realise you were working on this and I felt a bit weird having my crappy Julia code out in public!

It's back here:

I am using autodiff=false partly because I didn't write my code to be compatible with dual numbers initially, and partly because in a test a while ago, where I did put in the effort to make some similar code autodiff-compatible, I didn't notice any speedups. I should probably put some time in and change my code so I can test with autodiff on, though.

Just did some basic tests changing diff_type and I don't notice any difference between forward, central, and backward.

We assume the out-of-place form is only used for StaticArrays as an optimization. It lacks optimizations like Jacobian reuse, which only make sense for larger problems. To get better performance, you should rewrite your RHS function to be in-place.
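Concretely, the in-place form writes into a preallocated `du` (a sketch; the `assemble_*` helpers are hypothetical stand-ins for the real matrix construction):

```julia
using LinearAlgebra

# Hypothetical stand-ins for the real state-dependent matrices.
assemble_A(X) = Diagonal(1.0 .+ X .^ 2)
assemble_B(X) = Diagonal(1.1 .+ 0.0 .* X)
assemble_C(X) = Diagonal(2.0 .- X)
assemble_K(X) = -X

# In-place RHS: mutates du instead of allocating a new vector,
# which lets the solver use its buffer- and Jacobian-reuse paths.
function dynamics!(du, X, p, t)
    A, B = assemble_A(X), assemble_B(X)
    C, K = assemble_C(X), assemble_K(X)
    du .= (A * (B \ C)) \ K
    return nothing
end
```

With this four-argument signature, `ODEProblem(dynamics!, X0, tspan)` dispatches to the in-place solver path.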

Hi, thank you for your reply, this was very helpful and seems to be the main issue.

Simply creating a mock in-place function, `fun = (du, X, p, t) -> du .= dynamics_combined(X, p, t)`, takes FBDF(autodiff=false) from 12 s to 0.7 s runtime, and from 7000 to 413 function evals.

I might do another work-precision diagram just to see how well Julia algs do against CVODE using this interesting fix, but for now just some simple timings for a test case:

FBDF: 12.089 s
FBDF with mock in-place function: 735.099 ms
CVODE_BDF: 634.817 ms

Edit: and for a slightly more complex case, FBDF is now outperforming CVODE, so crisis averted, faith restored, and thank you everyone for your help.

The original motivation was that Zygote-in-solver support would require it. But Zygote is now so far off that it doesn't really matter. We should consider doing Jacobian reuse on the out-of-place solvers.