Is Julia reliable for solving ordinary and stochastic differential equations?

I got a bit worried about using Julia after reading Yuri Vishnevsky’s article (“Why I no longer recommend Julia”). I want to use this language for solving ordinary and stochastic differential equations. Besides this, I also want to use a random number generator (this would already be embedded in the stochastic DE solver). Kindly let me know whether Julia is reliable for at least these tasks. That is, are those correctness and composability issues not present at all in the tasks mentioned above?

P.S.: I know C++, Matlab, and R. Matlab is too slow for solving ODEs (R would have a similar speed, though I haven’t tried it). I could not find sufficient demo examples on how to use an ODE solver (say, Dopri5) in C++.

1 Like

The article started a lot of discussions and many opinions have been expressed on this matter. I am sure you can find them.

You are essentially asking whether Julia and a set of libraries written in Julia are bug-free.

EDIT: I should add that the Julian ecosystem around differential equations is one of the most mature ones, and you should expect no issues using it. See SciML Scientific Machine Learning Showcase
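(To give a sense of what using it looks like, here is a minimal sketch of an ODE solve with DifferentialEquations.jl; the logistic test equation and the tolerances are just placeholder choices of mine, not anything the library prescribes:)

```julia
using DifferentialEquations

# Logistic growth du/dt = u*(1 - u), a stand-in for your own right-hand side
f(u, p, t) = u * (1 - u)

u0    = 0.1
tspan = (0.0, 10.0)
prob  = ODEProblem(f, u0, tspan)

# Tsit5 is a good general-purpose explicit method; DP5() corresponds to dopri5
sol = solve(prob, Tsit5(); abstol = 1e-8, reltol = 1e-8)

sol(5.0)    # dense-output interpolation of the solution at t = 5
```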

3 Likes

Thanks for the reply. Yes, I was asking whether SciML/DifferentialEquations.jl (and the random number generator) are bug-free.

The answer to that is no - but I’m afraid the answer for any modestly complex piece of software will be no, e.g. here is the list of bug fixes for the next MATLAB release:

https://uk.mathworks.com/support/faq/pr_bugs.html

This is just a fact of life so you will have to think about ways to deal with it, which I presume mostly will revolve around testing results you’re producing against things you know to be true.

6 Likes

Thanks for the reply. Couldn’t believe that even Matlab could have so many bugs. In this respect, my question could be seen as [In comparison with Matlab] is Julia reliable…

Yes, Julia has one of the most advanced ODE and SDE frameworks, in terms of both capabilities and reliability.

However, it is extremely important to always program defensively (write a ton of assertions and consistency checks) no matter the language or framework you use. You already saw the example about Matlab above. I have used (and contributed to some of) Matlab, Mathematica, Maple, SymPy, NumPy, SciPy, TensorFlow, and PyTorch – all of them have occasionally had terrible bugs, and I would have wasted weeks or even published scientifically wrong results if I had not been in the habit of programming defensively. Julia is truly not any more or less stable when you average over a workload and its pitfalls: it has a younger ecosystem, but some of that ecosystem is better integrated; some parts of the ecosystem are in heavy development, but some are better than anything else in existence; it has fewer packages, but the package management and reproducibility story is better, etc. – plenty of pros and cons when you do an in-depth comparison.
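To make “program defensively” concrete, here is a minimal sketch of the kind of consistency check I mean. The harmonic-oscillator test problem, solver choice, and tolerances are my own arbitrary picks, not anything prescribed by DifferentialEquations.jl:

```julia
using DifferentialEquations, Test

# Harmonic oscillator u'' = -u written as a first-order system;
# the energy (u^2 + v^2)/2 is conserved and the exact position is cos(t).
function oscillator!(du, u, p, t)
    du[1] = u[2]     # position' = velocity
    du[2] = -u[1]    # velocity' = -position
end

u0 = [1.0, 0.0]
prob = ODEProblem(oscillator!, u0, (0.0, 100.0))
sol = solve(prob, Vern7(); abstol = 1e-10, reltol = 1e-10)

energy(u) = (u[1]^2 + u[2]^2) / 2

# Consistency check 1: the invariant is preserved along the whole trajectory.
@test all(abs(energy(u) - energy(u0)) < 1e-6 for u in sol.u)

# Consistency check 2: the numerical solution matches the analytical one.
@test abs(sol(100.0)[1] - cos(100.0)) < 1e-6
```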

4 Likes

All software has bugs. That said, our solvers have one of the most active developer communities, a very strong bug tracker, and a large test suite. See:

which tracks known issues (note that most of the issues are “new algorithm” requests, i.e. algorithms we are looking to implement). Along with our issue tracking, we have a test suite which currently takes around 25-30 hours to run and which tests all sorts of numerical properties. See for example:

That’s just for ODEs, and that’s a much larger test suite than what I can find in all of the other open source ODE solvers combined! Everything from convergence rates to high-precision arithmetic (and convergence with high-precision arithmetic) is tracked and documented.
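For a flavor of what a convergence-rate test looks like, here is a hand-rolled sketch (my own simplified version, not the actual SciML test code): solve a problem with a known solution at a sequence of fixed step sizes and estimate the observed order from the error ratios.

```julia
using OrdinaryDiffEq

# u' = -u with exact solution u(t) = exp(-t)
f(u, p, t) = -u
prob = ODEProblem(f, 1.0, (0.0, 1.0))

dts  = [0.1, 0.05, 0.025, 0.0125]
errs = [abs(solve(prob, Tsit5(); adaptive = false, dt = dt).u[end] - exp(-1.0))
        for dt in dts]

# Observed order between successive step sizes; a 5th-order method
# like Tsit5 should give numbers close to 5.
orders = [log2(errs[i] / errs[i+1]) for i in 1:length(errs)-1]
```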

SDEs then have their own full set of tens of hours of tests:

(though some started timing out because we are trying to do very high precision weak convergence testing with GPUs where the original paper required an HPC, long story but we’re buying new hardware for that).

But I think the right way to say it is: the SciML organization has a long history of mixing integration testing with downstream testing, including integrating tests from proprietary software like Pumas that’s used in locked-down clinical settings (https://pumas.ai/), and over time this monster of a test set has become pretty robust.

I think the robustness of the test set is best explained with an anecdote. For the development of DP5, we wanted a version that would exactly match the Fortran dopri5 code (and likewise DP8 with dop853). When doing the regression testing, we found we could only match if we set the initial dt… so something was up. When we dug around, we realized that the original dopri5 code actually has a bug in it: it’s missing a /length(u) in the second norm calculation, which makes the init dt guess dependent on the size of the ODE (i.e. if you take the same ODE and keep repeating it, you do not get a constant init dt guess; instead it converges to zero). This of course is in comparison to the formula stated in Hairer’s book, and the one in dop853, so we reported it and confirmed that the 1970’s dopri5 everyone wraps did have a bug in it. It’s pretty harmless, but :person_shrugging: it can’t pass our tests haha.
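To illustrate what that missing /length(u) changes, here is a toy sketch of the two norms (my own simplified illustration, not the actual solver code): the scaled RMS norm is independent of how many times you replicate the state, while the unscaled one grows with it, which is what drags the initial dt guess down as the system gets larger.

```julia
# Scaled RMS norm used in the init-dt heuristic in Hairer's book (and in dop853):
scaled_norm(x)   = sqrt(sum(abs2, x) / length(x))
# What the old dopri5 effectively computes in the second norm (no /length(u)):
unscaled_norm(x) = sqrt(sum(abs2, x))

u  = [1.0, 2.0]
u4 = repeat(u, 4)     # "the same ODE repeated" four times

scaled_norm(u)   ≈ scaled_norm(u4)    # true: independent of problem size
unscaled_norm(u) ≈ unscaled_norm(u4)  # false: grows by a factor of sqrt(4)
```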

Another fun anecdote is that the C-wrapped lsoda implementation that many use (SciPy, R) is actually 1-based. In order to ensure the calculation it gives back is correct, it allocates its arrays with one extra slot and then returns an array with the pointer shifted over by one. This is an artifact of the fact that it’s a Fortran translation:

This is why it has some issues with some BLAS wrappers. We found this out when doing performance tests of QNDF and building the wrapper GitHub - rveltz/LSODA.jl: A julia interface to the LSODA solver, and finding that it was allocating more memory than it should.

So is our code bug-free? No, but what we are doing is making sure it gets better every day, and we are open about every issue we have. In fact, it’s probably easier to write such a blog post about SciML tools than about most other tools simply because we are open and will tell you which tests are failing. If you want to write an inflammatory blog post that bugs exist, here are the materials right here! Issues · SciML/OrdinaryDiffEq.jl · GitHub . However, the fact that it’s very difficult to write such a blog post about the currently known bugs in MATLAB’s solvers is not indicative of it not having bugs; it’s indicative of the current issues not being clearly communicated.

Also, as you can see from the tracker, most of the issues have a clear exit or warning.

24 Likes

Thanks. Yes, this is why I’ve decided to write the same code in Matlab also, as a way of comparing.

Thanks for such a detailed explanation.

The very first bug on that list is quite impressive:

Aerospace Toolbox [2748178]
In certain cases, pointAt method of satellite scenario Satellite class interprets Euler angle inputs in radians rather than degrees

2 Likes

You might know this, but putting it out here: writing two separate implementations in two different languages is (1) a lot of work and (2) it usually uncovers superficial typo-like errors, not modeling errors. It is extremely valuable to put in some more fundamental checks (and that is frequently less work): e.g. check that energy or other invariants are conserved, check simple cases with analytical solutions, check reproducibility and statistical properties of the solutions, etc.
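As one concrete example of checking a statistical property (a minimal sketch with parameter values and sample size chosen arbitrarily by me): geometric Brownian motion has the known mean E[u(t)] = u0·exp(μt), so a Monte Carlo estimate from the SDE solver should reproduce it to within sampling error.

```julia
using DifferentialEquations, Statistics, Random

# Geometric Brownian motion du = μ u dt + σ u dW, with E[u(t)] = u0 * exp(μ t)
μ, σ, u0 = 1.0, 0.5, 1.0
f(u, p, t) = μ * u
g(u, p, t) = σ * u
prob = SDEProblem(f, g, u0, (0.0, 1.0))

Random.seed!(1234)   # make the statistical check reproducible

# Monte Carlo estimate of the mean at t = 1 from many independent trajectories
n = 10_000
sample_mean = mean(solve(prob, SRIW1(); save_everystep = false).u[end] for _ in 1:n)
exact_mean  = u0 * exp(μ * 1.0)

# Relative error; should be small, dominated by the Monte Carlo sampling error
abs(sample_mean - exact_mean) / exact_mean
```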

2 Likes

I see from last month:
2023: MIT Center for Computational Science & Engineering MathWorks Prize for Outstanding Master’s Research in Computational Science & Engineering. For Songchen Tan (MIT Julia Lab) for his work on TaylorDiff.jl for use in NeuralPDE.jl physics-informed neural networks (PINNs)

So it’s a prize (also) in the name of MathWorks, i.e. the maker of MATLAB. The SciML ecosystem is for sure doing something (well, many things) right. The only language (of those you know) potentially competing with Julia on speed is C++. The others are not, unless you wrap fast libraries written in other fast languages; that can be done for e.g. BLAS, but I understand it is more challenging for equation solvers.

I believe all the issues Yuri brought up are solvable, if not already solved, and the only really intriguing issue he brought up at that time was OffsetArrays.jl, which I very much doubt is an issue for solving your equations:

There are a lot of problems with OffsetArrays.jl, but most other languages do not even have something like OffsetArrays.jl, or the expectation that most code written would automatically work with custom indices.

I (still) have full confidence in the Julia core developers and e.g. Chris for math/technical computing such as you bring up. Julia, however, isn’t bug-free (no software ever will be), but issues are handled well.

Yuri no longer has any open PRs in Julia; all 11 of his have been merged or closed. But he also has open issues, including 2 from 2 weeks ago, which are rather interesting:

Issues · JuliaLang/julia · GitHub [after clicking the link the + needs to be changed to a space; I’m not sure how to put a correct link into Discourse, it breaks when I do it, so I’m not sure it’s possible…]

Note, I think all ordinary integer and floating-point division is correct in Julia for all types (it’s handled by CPU instructions, which I assume are correct); his issue is about floored division (e.g. div).

That error rate suggests that this bug manifests once out of every ~1,700 randomly chosen Float16 divisions
[…]
Based on a billion samples, the bug manifests once out of every ~10,000 randomly chosen Float32 divisions and once out of every ~74,000 Float64 divisions.

Division (and e.g. square root) can sometimes be exact, but in general it can only be correctly rounded, so you always have a small error. According to chaos theory, small errors can blow up. I don’t think this is too worrying, i.e. getting a result that is a tiny bit off from the correctly rounded one. If you’re sensitive to that, then most likely you’re sensitive to the small error in the correctly rounded result too?

I’m curious: for differential equations you can have arbitrary operators, and e.g. division is common, but would you have floored division (fld or div) often, or ever? If not, then you don’t need to worry about that issue at least.
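(For clarity, “floored division” means fld / div, not the ordinary / that typically appears in an ODE right-hand side; a quick illustration:)

```julia
7.3 / 2.0       # 3.65  (ordinary floating-point division, the usual operation in an RHS)
fld(7.3, 2.0)   # 3.0   (floored division, the operation the reported bug is about)
div(7.3, 2.0)   # 3.0   (truncated division, rounds toward zero)
```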

Since Yuri was posting new issues recently, he at least still cares about Julia, and likely still uses it, or why would he even have known about or posted those issues two weeks ago?

1 Like

Thanks for the input. Yes, I do believe that eventually all the concerns raised by Yuri will get sorted out. I wouldn’t have wasted a few months on Matlab and C++ if I had not become disillusioned by Yuri’s article (though I believe his criticism turned out to be highly valuable for the language) and by the need to know a few guidelines for writing efficient Julia code, without which performance can be much slower than, say, Python’s.
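(For anyone landing here later: the single most important of those guidelines is to keep performance-critical loops inside functions rather than at non-constant global scope. A minimal sketch, with the loop body being just a stand-in:)

```julia
x = rand(10_000)              # non-constant global: its type can change at any time

function sum_squares(v)       # the same work wrapped in a function
    s = 0.0
    for xi in v
        s += xi^2             # type-stable: the element type of v is known here
    end
    return s
end

sum_squares(x)                # fast

# Writing the loop directly at top level against the global `x` would instead
# force dynamic dispatch on every iteration, which is the classic pitfall the
# official performance tips warn about (avoid untyped globals, put code in functions).
```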