Debugging computer crash during ODE solve

My desktop solves a set of ODEs without issue, but my laptop consistently bluescreens while solving. What is the best way to find the cause of this behavior?

Thanks!

I can’t say I have ever seen this. Is it a specific ODE or all of them?

So far I’ve only seen it with one PDE, and only when I use symbolic_discretize, run structural_simplify on the result, and then generate a symbolic Jacobian.

If I use:

prob = ModelingToolkit.discretize(pdesys,discretization);
sol = solve(prob, QNDF(), saveat=delta_t);

it solves just fine, but if instead I use:

ode_sys, tspan = symbolic_discretize(pdesys, discretization);
simp_sys = structural_simplify(ode_sys);
ode_prob = ODEProblem(simp_sys, nothing, tspan, jac=true, sparse=true);
solve(ode_prob, QNDF(), saveat=delta_t)

it bluescreens my laptop.

I’m very interested in doing the work to find the problem, but I am having a hard time figuring out how to do a post-mortem on Julia after a hard crash.

My suspicion right now is that it might have something to do with this warning I get about Symbolics with my equation structure, but I haven’t been able to make it work at all with the new syntax.

Warning: The variable syntax (u[1:n])(..) is deprecated. Use (u(..))[1:n] instead.
│                   The former creates an array of functions, while the latter creates an array valued function.
│                   The deprecated syntax will cause an error in the next major release of Symbolics.
│                   This change will facilitate better implementation of various features of Symbolics.

Try solve(ode_prob, QNDF(autodiff=false), saveat=delta_t). I wonder if it’s just a giant expression, so the autodiff is taking a bunch of memory (and thus the compile time is big too).

I just tried it and it still crashed with autodiff=false. Looking at memory usage on my laptop vs desktop, the laptop does fill my RAM both with and without disabling autodiff, but the desktop doesn’t go past a few gigabytes. I have 16 GB on my laptop and 32 on my desktop, so I wouldn’t have expected either to have problems.

During run or compilation?

I’m sorry, I’m not sure how to check that.
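One way to tell the two apart, as a minimal sketch: time the same call twice and watch memory around each call. The first call includes JIT compilation, the second does not, and on Julia 1.6 or later @time also prints the share of time spent compiling. The work function below is a hypothetical stand-in for the real solve call, and gib is just a helper for this example. If memory fills up and the machine goes down before the first call ever returns, that points towards compilation rather than the run itself.

```julia
# Hypothetical stand-in for the real call; in the actual session, replace the
# body with: solve(ode_prob, QNDF(), saveat=delta_t)
work() = sum(abs2, rand(1_000))

# Helper for this example only: bytes -> GiB, rounded for printing
gib(bytes) = round(bytes / 2^30; digits=2)

println("Free RAM before:            ", gib(Sys.free_memory()), " GiB")

# First call: includes JIT compilation of everything the call touches.
@time work()
println("Peak RSS after first call:  ", gib(Sys.maxrss()), " GiB")

# Second call: already compiled, so this measures the run alone.
@time work()
println("Peak RSS after second call: ", gib(Sys.maxrss()), " GiB")
```

Sys.maxrss() reports the peak resident set size of the Julia process so far, so a large jump after the first call but not the second suggests the memory went into compilation.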

How big of a discretization?

If this is MethodOfLines related, I’d like to see your system if you can share.

The discretization creates a set of ODEs with 606 equations, and after structural_simplify, I am left with 600 equations for a discretization along a single axis in space plus time.

I am using MethodOfLines, and I will check with my PI to see if I can share the code.

Can you try passing the kwargs through MOLFiniteDifference with discretize? Like MOLFiniteDifference([x => dx], t, jac = true, sparse = true), that should be equivalent to what you are doing - structural_simplify is run automatically when you call discretize.

Code has been shared privately and I have run a brief analysis of memory usage. Here is an excerpt of the code, showing the discretization and solves:

# Method of lines discretization
discretization = MOLFiniteDifference([x=>delta_x],t; approx_order=order)
discretization2 = MOLFiniteDifference([x=>delta_x],t; approx_order=order, jac = true, sparse = true)

println("Time for discretization:")
@time prob = ModelingToolkit.discretize(pdesys,discretization);
@time prob2 = ModelingToolkit.discretize(pdesys,discretization2);

println("Time for solving:")

@time sol = solve(prob, QNDF(), saveat=delta_t, reltol=1e-8, abstol=1e-8);

println("new method")
@time sol2 = solve(prob2, QNDF(), saveat=delta_t, reltol = 1e-8, abstol = 1e-8) # Massive Allocation

I have verified that the above construction of prob2 is indistinguishable from the other method of problem construction @johnb detailed previously.
Memory usage is remaining sensible (under 5GiB) until the solve of the “new method” shown here (bear in mind a solve without jac=true, sparse=true has been run previously), where memory shoots up to 37.5GiB, apparently during compile. This occurs with or without autodiff=false. I suspect this high memory usage is causing the crash, and my intuition tells me that this is occurring during compilation of the jacobian. I would be interested to know what @ChrisRackauckas thinks.
After the solves, RSS drops to 36.9GiB, but a look at the size of all variables seen in the snippet shows them all to be on the order of a few tens of MB each, so I don’t know to what this memory is allocated.
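For anyone who wants to repeat the variable-size check, here is a minimal sketch using only Base functions. The objs dictionary holds hypothetical stand-ins; in the real session it would hold prob, prob2, sol and sol2. Base.summarysize counts the memory reachable from an object, and as far as I know it does not include memory held by compiled code, which may be why the totals come out so far below the RSS.

```julia
# Hypothetical stand-ins; in the real session these would be prob, prob2, sol, sol2.
objs = Dict("small" => collect(1:10), "large" => rand(100_000))

# Helper for this example only: bytes -> MiB, rounded for printing
mib(bytes) = round(bytes / 2^20; digits=2)

# Size of everything reachable from each object
for (name, obj) in objs
    println(name, ": ", mib(Base.summarysize(obj)), " MiB")
end

# Force a full garbage collection and see how much system memory comes back.
free_before = Sys.free_memory()
GC.gc()
free_after = Sys.free_memory()
println("Change in free RAM after GC: ",
        mib(Int(free_after) - Int(free_before)), " MiB")
```

The free-memory difference is only a rough indicator, since other processes on the machine also change it.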

This may be memory fragmentation, and related to similar results I had when reading very large CSV files from the census:

With this issue filed:

If I’m reading that issue right, it sounds like that was a Linux and maybe macOS issue. Both of my machines are on Windows, but I’m not sure how Windows deals with memory allocations.

Yes, that is completely unused when analytical Jacobians are given.

Yes, and we knew this could happen. We need to do the Jacobian codegen in a different way.

Is this related to these issues in Symbolics.jl?

compile time sparse derivatives · Issue #788 · JuliaSymbolics/Symbolics.jl · GitHub

Yup same issue.

Found an older issue that seems related as well. I guess this has been a known issue for a while, but I’m having trouble finding any breadcrumbs for a workaround. Am I missing something, or is this just a limitation for some systems?

John - This seems to be a limitation of the Jacobian code generation when that code is generated from certain types of PDE systems. It is likely structural, a consequence of how Jacobian code generation is implemented at the moment, so the implementation will need to be fixed first. This sort of code generation is a relatively new approach, so people are still ironing out the problems, but rest assured that this is being looked at, and that more attention is on it now.

In the meantime, is there any way you could approach your problem without automatic Jacobian generation, or with more memory? Or is this a hard limitation?
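To give a feel for what gets generated, here is a minimal sketch on a toy problem. It assumes Symbolics.jl is installed; the 3-point stencil in rhs is made up for illustration and is not the system from this thread. My understanding, not verified against the package internals, is that each nonzero Jacobian entry ends up as its own expression in the generated function, so the amount of code handed to the compiler grows with the number of nonzeros and with the size of each entry.

```julia
using Symbolics
using SparseArrays

n = 5                      # toy size; the system in this thread has 600 states
@variables u[1:n]
us = collect(u)            # plain Vector of scalar symbolic variables

# Made-up nonlinear 3-point stencil, standing in for a discretized PDE
rhs = [us[max(i - 1, 1)] * us[i] - 2us[i]^2 + us[min(i + 1, n)] for i in 1:n]

# Symbolic sparse Jacobian: only structurally nonzero entries are stored
J = Symbolics.sparsejacobian(rhs, us)
println("Nonzero Jacobian entries: ", nnz(J))

# build_function returns (out-of-place, in-place) expressions for array inputs
f_oop, f_iip = build_function(J, us)
println("Characters of generated in-place code: ", length(string(f_iip)))
```

Raising n and watching the generated code length grow gives a rough picture of why a 600-state system with complicated entries can be expensive to compile.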

I apologize, I didn’t intend to sound like I was trying to push too hard for a quick fix! I can definitely use more traditional strategies to solve my problem, and it’ll give me more breaks anyway. :)

Thank you for your help in troubleshooting, and for the insight into how the packages work together.