DifferentialEquations Package Kills Performance Everywhere when TruncatedStacktraces is used

After loading the DifferentialEquations.jl package in the REPL, Julia seems to start performing very slowly, possibly only when text is output to the REPL, but in a way that makes the REPL extremely frustrating to use. Starting from a fresh Julia REPL session, I perform the following operations:

julia> @time a = [1,2,3,4]
  0.000009 seconds (1 allocation: 96 bytes)
4-element Vector{Int64}:
 1
 2
 3
 4

julia> using DifferentialEquations

julia> @time a = [1,2,3,4]
  0.000003 seconds (1 allocation: 96 bytes)
4-element Vector{Int64}:
 1
 2
 3
 4

julia> 

The first @time a = [1,2,3,4] is nearly instantaneous, as I would expect. But after using DifferentialEquations (which itself takes 30-60 seconds), nearly 10 seconds elapse between hitting enter after @time a = [1,2,3,4] and when I am able to type in the REPL again.

This is especially bad when an exception occurs. The stacktrace output can take over a minute to finish.

Why is the inclusion of the DifferentialEquations package so dramatically affecting the performance of seemingly unrelated operations, including those in Base? I have observed this behavior on two computers.


This is because DifferentialEquations is overloading type printing to make stacktraces more readable. That forces any code that might show a type to recompile (and you really aren’t supposed to do things like this, because it also breaks things like help mode in the REPL for types that have the overloaded show methods). @ChrisRackauckas
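For concreteness, here is a minimal sketch of the kind of overload being described, using a made-up type (this is not the actual SciML code):

```julia
# Hypothetical example of overloading how a *type object* itself prints.
# Defining a method like this invalidates previously compiled code that
# prints types -- which includes stacktrace printing, the REPL, and docstrings.
struct MySolver{A, B, C} end

# Compact display for the type, in the spirit of stacktrace truncation:
Base.show(io::IO, ::Type{<:MySolver}) = print(io, "MySolver{...}")

sprint(show, MySolver{Int, Float64, Nothing})  # "MySolver{...}"
```

Any code that was compiled against the generic type-printing method now has to be recompiled, which is where the latency comes from.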


A PR to Base fixes this with an experimental feature that we’re already set up to use:

Sadly, it’s not being merged, even as a marked experimental feature. I mentioned this here:

I don’t understand why we cannot have this as an experimental feature in v1.9; even if it’s removed later, we have everything safely defined:

So if this was merged for v1.9.0 and removed shortly after, we would be okay. But without getting some buy-in from Base, there’s literally no tool to solve what we need to solve.

That said, the TruncatedStacktraces.jl README is very clear that you can disable all of the stacktrace truncation with just a line of code:

using Preferences, UUIDs

using TruncatedStacktraces
Preferences.set_preferences!(TruncatedStacktraces, "disable" => true)

# OR if you don't want to load TruncatedStacktraces.jl

Preferences.set_preferences!(UUID("781d530d-4396-4725-bb49-402e4bee1e77"), "disable" => true)

So take your pick: either stack traces stay super long and the original problem goes unsolved, or you get invalidations. Getting both short stack traces and no invalidations requires a custom Julia build (using the branch I linked) or begging someone to merge something into Base Julia.


Thanks! Is this new behavior? I’m surprised such a popular package has such an impactful issue. If possible I would like to roll back to a previous version where this behavior does not occur.

You don’t have to. Literally just set the preference.

using Preferences, UUIDs
Preferences.set_preferences!(UUID("781d530d-4396-4725-bb49-402e4bee1e77"), "disable" => true)

That won’t fix it. The problem isn’t the abbreviated stack trace, but the redefinition of type showing. For example SciMLBase.jl/callbacks.jl at 587cfcc9cf4277b35d3a0a2b08a21d5c9467bf8f · SciML/SciMLBase.jl · GitHub means that anyone who loads SciMLBase will have this issue.

That makes no sense. The redefinition of type showing only exists because the stack trace silencing hasn’t been merged into Base as we expected it to be. So yes, if it merges then we just change TruncatedStacktraces.@truncate_stacktrace VectorContinuousCallback to match the Base stack trace silencing and it all goes away. We’re waiting and hoping this merges for v1.9 or something soon.

The problem is that by overloading type showing, you invalidate any code that shows a type which includes package loading, a bunch of the repl, and docstrings.

Yes, and so it all goes away when you set the preference, because the methods are then statically not defined.

The method I’m complaining about is in SciMLBase. It gets defined even if TruncatedStacktraces.jl isn’t loaded.

That was an oversight fixed in Missed a show -> truncatedstacktraces macro by ChrisRackauckas · Pull Request #423 · SciML/SciMLBase.jl · GitHub


Totally! Why isn’t this disruptive behavior opt-in?

It should be the other way around: the default behavior is “don’t change anything”, with the option to “enable” stacktrace rewriting.

And the “disable” setting doesn’t seem to work for me: even after Preferences.set_preferences!(UUID("781d530d-4396-4725-bb49-402e4bee1e77"), "disable" => true), stacktraces mention that package:

Some of the types have been truncated in the stacktrace for improved reading. To emit complete information
in the stack trace, evaluate `TruncatedStacktraces.VERBOSE[] = true` and re-run the code.

This message isn’t even correct: nothing is actually truncated.


Dare I suggest another solution to the SciML stacktrace problem? Don’t define types where 10 out of the 13 fields are parameterized, like this one:

struct ODESolution{T, N, uType, uType2, DType, tType, rateType, P, A, IType, S,
                   AC <: Union{Nothing, Vector{Int}}} <:
       AbstractODESolution{T, N, uType}
    u::uType
    u_analytic::uType2
    errors::DType
    t::tType
    k::rateType
    prob::P
    alg::A
    interp::IType
    dense::Bool
    tslocation::Int
    stats::S
    alg_choice::AC
    retcode::ReturnCode.T
end

In my opinion, this is the real reason why SciML stacktraces are hard to read.


What is the alternative here? The types likely do need to be quite generic since there are many options to choose from, but just leaving the fields as ::Any wouldn’t be performant, and performance is the whole point of these parameterized types.

I’m no expert on SciML internals, but the alternative to parameterizing every field is to choose the types that you want to use for your internal code and work a little harder in your constructors to convert user input into the concrete data types that your code will actually use. This is made even easier by the implicit conversions done by default constructors.
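A minimal sketch of that approach, using a hypothetical type (not anything from SciML): pin the internal field types and let the constructor do the conversion work.

```julia
# Hypothetical type with fixed concrete field types:
struct Trajectory
    t::Vector{Float64}
    u::Vector{Float64}
end

# The default constructor already `convert`s, so Int vectors are accepted:
traj = Trajectory([0, 1, 2], [1, 4, 9])   # fields stored as Vector{Float64}

# A custom outer constructor can widen what callers may pass, e.g. ranges:
Trajectory(t::AbstractVector, u::AbstractVector) =
    Trajectory(collect(Float64, t), collect(Float64, u))
ranged = Trajectory(0:2, [1, 4, 9])
```

The trade-off is exactly the one debated below: this rules out element types with genuinely different arithmetic.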

Another alternative is to leave the fields as Any (or an appropriate abstract type) and use function barriers. I would guess that most of the performance critical parts of SciML are behind function barriers anyways.
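A tiny sketch of the function-barrier pattern, again with a hypothetical type: the outer call pays one runtime dispatch, and the inner kernel compiles specialized code for whatever concrete type the field holds.

```julia
# Untyped field: flexible, but opaque to the compiler.
struct LooseSolution
    u::Any
end

# One runtime dispatch at the barrier; `_energy` is compiled specialized
# for the concrete type of `u` it receives.
energy(sol::LooseSolution) = _energy(sol.u)
_energy(u) = sum(abs2, u)   # fast, type-specialized inner kernel

energy(LooseSolution([3.0, 4.0]))  # 25.0
```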


I am quite certain that less parameterization is not a feasible option.

To give some examples: even if only u were parameterized, it could be a pretty long expression.
Types which could appear here include, for example, Unitful quantities, Duals for automatic differentiation, some kind of ComponentArray for better access to the solution, CuArrays for GPU computation, Intervals for uncertainty measurements, or Nums for symbolic nightmares ;). This is just a list of types I have used already; I’m sure there are many more. Note that many of these are genuinely different types with different arithmetic, so you cannot just convert them to an Array{Float64, N}.

The reason Any is not a good uType is that you don’t want a runtime dispatch at each time step. Moreover, you might want to forward ODESolution (or other SciML internal types like the integrator) to many functions, so it is not just one function barrier but many. The alternative would be to forward only u with its concrete type, but then stack traces get long again.


Ok, that explains one of the parameters. What about the other 9? I could certainly be wrong, but my intuition says there’s a way to redesign the code base that doesn’t rely so heavily on type parameters. That wasn’t even the most egregious example. Consider these:

mutable struct DEOptions{absType, relType, QT, tType, Controller, F1, F2, F3, F4, F5, F6,
                         F7, tstopsType, discType, ECType, SType, MI, tcache, savecache,
                         disccache}
# ...
end
mutable struct ODEIntegrator{algType <: Union{OrdinaryDiffEqAlgorithm, DAEAlgorithm}, IIP,
                             uType, duType, tType, pType, eigenType, EEstT, QT, tdirType,
                             ksEltype, SolType, F, CacheType, O, FSALType, EventErrorType,
                             CallbackCacheType, IA} <:
               DiffEqBase.AbstractODEIntegrator{algType, IIP, uType, tType}
# ...
end

I can’t even count how many type parameters that is.

Furthermore, many of those type parameters are set to Nothing, which you can see in SciML stacktraces. Why have 20 parameters, 15 of which are Nothing?


absType and relType are the type of the absolute and relative errors respectively. These can be either Floats if you want an L2 norm or arrays if you want component-wise norms. prob is the problem type which can be a wide variety of things. alg is the type of the algorithm that was used (there are about 200 options here). stats could possibly be made to be a concrete type, etc.

One good example is maybe Parallel Ensemble Simulations · DifferentialEquations.jl, where you essentially run the whole thing millions of times… Even if a single solve is fast, you cannot afford much runtime dispatch in such a setup.

But just to be sure, I get the point that the design could maybe be less greedy. But that’s not really in the nature of that project.

Look at this chunk from a SciML stacktrace:

Notice that ~90% of the parameters are Nothing. If 90% of your type parameters are Nothing, that indicates you could develop types that better capture your domain logic and are not 90% empty.
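One possible shape for such a refactor, sketched with hypothetical types (not a concrete SciML proposal): rather than one type with many slots that are usually nothing, compose smaller types so absent features never appear in the signature at all.

```julia
# Core data with no optional slots -- no `Nothing` parameters needed:
struct CoreSolution{U, T}
    u::U
    t::T
end

# Optional features wrap the core only when they are actually present:
struct SolutionWithErrors{S <: CoreSolution, E}
    sol::S
    errors::E
end

core = CoreSolution([1.0, 2.0], [0.0, 0.1])
```

The cost is more wrapper types and more method signatures, which is presumably part of why SciML chose the flat design.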

Again, I’m not a SciML developer, but to me this all points to an opportunity to rethink the design of SciML types. But take it with a grain of salt. I’ll get off my soap box now.
