Setting `abstol` and `reltol` when solving the Schrödinger equation with OrdinaryDiffEq

Note: This topic was originally title “DiffEqOrdinaryDiffEq fails to preserve norm in simple Schrödinger equation”, which was solved by setting reltol. The more interesting part of the discussion is how to set abstol and reltol when simulating the dynamics of the quantum system with and ODE solver.


I’m having an issue with OrdinaryDiffEq producing obviously wrong results (up to me doing something stupid, i.e., using it wrong). Consider the following MWE:

using OrdinaryDiffEq: OrdinaryDiffEq as ODE, ODEProblem, solve as ode_solve
using Plots

function f_ode(u, p, t)
	Δ = 1.0
	α = 4.6
	X = -1im * Float64[
         0.5 * α * t          Δ
               Δ       -0.5 * α * t
    ]
    return X * u
end

Ψ = [-0.999998+0.0im, 0.0021739+0.0im]  # eigenstate at t = -100

tlist = collect(range(-100.0, 100; length=1001))

ode_problem_direct = ODEProblem(ODE.ODEFunction(f_ode), Ψ, (tlist[begin], tlist[end]))

ode_sol_direct = ode_solve(ode_problem_direct, ODE.Tsit5(); abstol=1e-12, saveat=tlist)

popdyn_ode_direct = reduce(hcat, map(Ψ -> abs2.(Ψ), ode_sol_direct.u))

plot(tlist, popdyn_ode_direct'; label=["|⟨Ψ(t)|ϕ₀⟩|²" "|⟨Ψ(t)|ϕ₁⟩|²"], legend=:right, title="Projection onto canonical basis states", xlabel="time", ylabel="population")

That should be solving

\frac{\partial}{\partial t}|\Psi\rangle = -i H |\Psi\rangle

with the Hermitian

H = \begin{pmatrix} \frac{\alpha}{2} t & \Delta \\ \Delta & -\frac{\alpha}{2} t \end{pmatrix}\,.

Whatever the actual dynamics are, because H is Hermitian, the resulting dynamics should be unitary, i.e., this should be preserving the norm of |\Psi\rangle. The MWE does not:

This works fine using my own solver, which gives me this correct-looking result:

What am I doing wrong here with using OrdinaryDiffEq?

1 Like

The most immediate fix is to lower the relative tolerance:

ode_sol_direct = ode_solve(ode_problem_direct, ODE.Tsit5(); abstol=1e-12, reltol = 1e-10, saveat=tlist)

Typically, although I don’t think it affects the accuracy, there’s also no need to include the saveat = tlist. You can get get the values of the solution by interpolation by calling ode_sol_direct(t) after you’ve solved it.

This is not my area, but Tsit5() is probably not the best choice for solver for this problem, check the “Probelm Types” heading in the left bar of DifferentialEquations.jl: Efficient Differential Equation Solving in Julia · DifferentialEquations.jl . I don’t remember Quantum Mechanics stuff well enough to figure out if any of those types would match up with your system.

3 Likes

Standard RK methods are not norm preserving. As mentioned above, you can lower the tolerances and that should work for sufficiently short integrations. If you have long time integrations, you can use the symplectic Runge-Kutta methods or the methods designed for non-autonomous linear systems.

4 Likes

Indeed, I found while playing around with this that the tstops here actually hurt the accuracy, and asking for an abstol also hurt accuray (rather than a reltol).

In addition, Vern8() tends to be a more accurate solver for this sort of problem since it’s symplectic and thus will keep the Hamiltonian on the symplectic manifold, conserving various properties. (edit: you can of course do better with the sorts of solvers Chris mentioned above, but Vern8 does better than Tsit5 here and is easier to switch out).

Here’s what I see for the total norm of the wave-function over time after making some adjustments:

using OrdinaryDiffEq: OrdinaryDiffEq as ODE, ODEProblem, solve as ode_solve
using Plots

function f_ode(u, p, t)
	Δ = 1.0
	α = 4.6
	H = Hermitian([
        0.5 * α * t   Δ
        Δ            -0.5 * α * t
    ])
    -im * (H * u)
end

function tst(;tlist = range(-100, 100; length=1001), alg=ODE.Vern8())

    Ψ₀ = [-0.999998+0.0im, 0.0021739+0.0im]  # eigenstate at t = -100

    ode_problem_direct = ODEProblem(f_ode, Ψ₀, (tlist[begin], tlist[end]))

    Ψ = ode_solve(ode_problem_direct, alg;
                  reltol=1e-10)
    norms = map(t -> norm(Ψ(t)), tlist)
    plot(tlist, norms;
         label="⟨Ψ⟩²",
         legend=:right, title="Projection onto canonical basis states", xlabel="time", ylabel="population")

end 

which I think is a pretty impressive level of norm-conservation.

If it becomes a problem beyond that though, you can do tricks like using callbacks to re-normalize your solution.

3 Likes

Indeed, reltol fixes the problem. I don’t think I have a good grasp on setting abstol and reltol reliably. The way I’d usually think about errors in propagators would for the norm(Ψ - Ψ_exact) < err, where Ψ is the result of the propagator at final time T, Ψ_exact is the hypothetical analytic solution (in practice, obtained from a solver with all precisions cranked up to the max), and err is the error I’d like to prescribe (typically something like 1e-12). It’s a little non-ideal that there are two error parameters abstol, reltol that I have to tweak, and I have to get a better understanding of how to connect my err with abstol and reltol.

I suppose the key is the equation given in the “Basic step size control” of the CommonSolve.Solve docstring, but I have to have a closer look (Unless someone with more experience can give me some guidelines :wink: )

there’s also no need to include the saveat = tlist

Yeah, but there are reasons I need the states exactly on that time grid. But this is really just for the sake of the MWE / comparing with my normal solver. In my actual code, I use the integrator interface. But I do have to work on a specific time grid (because that’s how gradients in quantum control work).

This is not my area, but Tsit5() is probably not the best choice for solver for this problem

That’s interesting… I’m very much not an expert in general ODE solver methods. But to my knowledge when people have “traditionally” used ODE solvers for quantum dynamics (as an upgrade from something low-precision like Crank-Nicholson), they’ve typically used RK45. So I just sort of assumed Tsit5 would be the equivalent-but-better default.

The title of the post may be a bit misleading, in that I care about “norm-preserving” only in so far as I care about “correct” in terms of the aforementioned norm(Ψ - Ψ_exact) < err, and that I know Ψ_exact to have norm 1. It’s not necessarily a problem that errors are reflected in a deviation of the norm. At least that’s pretty easy to detect, and thus possibly preferable to “grossly wrong, but with the correct norm”.

That’s not to say that I wouldn’t want to try solvers that inherently norm-preserving. My “workhorse propagator” of expanding exp(-i*H*dt) in Chebychev polynomials (the “correct” solution in my OP) is norm-preserving, after all. But that propagator is piecewise-constant, and the reason I ran into this whole issue was that I wanted to double-check with OrdinaryDiffEq that I wasn’t running into time-discretization errors. That is, compare with the solution for the time-continuous (adaptive time step) ODE solver. Which then, with the ill-advised approach of using OrdinaryDiffEq as a black box, gave me wildly inaccurate results.

That’s weird, and a little concerning. But ultimately, I’m really just going to have to nail down how to reliably choose abstol and reltol based on my more general notion of “error”, and then see what is a good default ODE method for quantum dynamics.

This particular ODE might also be a bit pathological. I’m actually probing here a parameter regime where the adiabatic Landau-Zener transition (where everything would evolve very smoothly) breaks down. Still, it would be good to have a reliable ODE solver in my toolbox that doesn’t require much handholding.

I’ll definitely have a look at Vern8 and the other solvers Chris mentioned.

Thanks!

I was curious and tried this out using Magnus solvers and fixed time grid as you mentioned you needed that, and it seems to have performed very well:

using OrdinaryDiffEq, SciMLOperators, Plots

function tst2(;)
    tlist = range(-100, 100; length=1001)
    
    Ψ₀ = [-0.999998+0.0im, 0.0021739+0.0im]
    
    function update_func!(W, u, p, t)
        Δ = 1.0
	    α = 4.6
        W[1,1] = -im*(0.5 * α * t)
        W[2,1] = -im*(Δ)
        W[1,2] = -im*(Δ)
        W[2,2] = +im*(0.5 * α * t)
    end
    W = MatrixOperator(zeros(ComplexF64, 2, 2); update_func!) # -im * H
    
    prob = ODEProblem(W, Ψ₀, (first(tlist), last(tlist)))
    
    Ψ = solve(prob, MagnusMidpoint(), dt=step(tlist), saveat=tlist)

    popdyn_ode_direct = reduce(hcat, map(t -> abs2.(Ψ(t)), tlist))
    p = plot(tlist, popdyn_ode_direct'; label=["|⟨Ψ(t)|ϕ₀⟩|²" "|⟨Ψ(t)|ϕ₁⟩|²"], legend=:right, title="Projection onto canonical basis states", xlabel="time", ylabel="population")
    plot!(tlist, norm.(Ψ.(tlist)); label="|⟨Ψ(t)|Ψ(t)⟩|")
    p
end

Here’s a zoom-in on the drift of |⟨Ψ(t)|Ψ(t)⟩| and it is very precise:

This solver is also extremely fast so that’s nice (8x faster than Tsit5() for this example)

3 Likes

So the key point about the 2 tolerances is that the step gets accepted if either passes. The easiest way to use this intuitiively is to set reltol=0 and then control your tolerance purely through abstol. The reason this isn’t the default is that abstol isn’t scale invariant so it can only be set well by the user. If you are looking for a more general tolerance setting, you should control your tolerance via reltol and set abstol to some value small enough that it’s able to be considered the same as 0.

3 Likes

If the user specifies a positive abstol but not a reltol, it would be nice if the reltol were set to zero automatically. (This is how Base.isapprox works.)

2 Likes

Firstly, both abstol and reltol are stepwise, not global. They bound the acceptable error for a single time step, but larger errors may accumulate over the course of the simulation.

For each step, abstol is a tolerance on the next value of norm(Ψ - Ψ_exact) conditional on the current Ψ being exact, while reltol is a tolerance on norm(Ψ - Ψ_exact) / norm(Ψ). Of course, you don’t actually know Ψ_exact, but if anything the error estimates lean toward the pessimistic side, so this is usually quite robust. (Specifically, for an order n algorithm the error is estimated by comparing the proposed step to one computed using an order n - 1 algorithm. This can really be thought of as a heuristic error estimate for the order n - 1 algorithm, and should thus be an overestimate for the order n algorithm unless something pathological is going on.)

abstol specifies what zero means to you in absolute terms, while reltol specifies the number of accurate digits you demand. If you know the typical magnitude of your state—say, the typical or maximum value of norm(Ψ) during the simulation—it’s natural to set abstol = scale * reltol. I think matlab ode solvers can even do this automatically: you specify a reltol, and the integrator keeps track of the largest state encountered and updates abstol accordingly. However, if the norm of your states is fairly constant, at least in order of magnitude terms, you might as well only specify reltol while keeping abstol = 0.

For quantum states where norm(Ψ) = 1, the distinction doesn’t really matter and abstol and reltol are interchangeable. In this case, I’d set abstol to zero and experiment with reltol.

1 Like

Your H is not Hermitian, unless alpha is purely imaginary.

I think X = -iH in the code.

Still not OK, because Delta is real.

H is real-symmetric. I think you are mis-reading something, @photor.

Yes, exactly. Which entails H= H^{\dagger}, that is, H_{ij} = H_{ji}^* = H_{ji}. And then U = \exp(-i H dt) is unitary, which is why the time evolution should be norm-preserving. That part, as the standard Schrödinger equation (and U being the standard time evolution operator), is definitely not what’s causing any problems :wink:

The Schrödinger equation is

i{\partial \psi \over \partial t} = H \psi

In ODEProblem, you need to provide an f such that

{\partial \psi \over \partial t} = f(\psi, p, t)

In order to do that, he needs f(\psi, p, t) = -i H(t) \psi. The Hamiltonian is Hermitian here, but a Hamiltonian alone is technically not the generator of time translations, it’s actually the anti-Hermitian -i H that generates them.

This is a pretty general thing in Lie algebras, where physicists like to talk about some Hermitian operator, but actually the thing doing the real work on the manifold is anti-Hermitian.

2 Likes

This discussion is extremely helpful, but let me try to pull apart what we mean by “norm” and “magnitude” in these situations. The docstring of CmmonSolve.Solve has that equation that (if I read it correctly) says a step is accepted if

\frac{\text{err}}{a + \bar{u} \, r} < 1

where \text{err} is an (absolute) local error estimate (e.g., the comparison to the higher order of the solver, I presume), a is my abstol, and r is my reltol. The \bar{u} is described as

\bar{u} = \max(u_{\text{prev}}, u)

I’m not 100% sure what that means. I’m guessing u_{\text{prev}} is just the value from the previous time step, and accounts for if u goes from e.g. 1 to 1e-10 in a time step, 1 is still the relevant scale. But maybe let’s ignore that and read this as \bar{u} = |u|.

Now, I get the impression that this equation is elementwise. That is, u is each component of my complex state vector |\Psi\rangle (or, more precisely, every real and imaginary part of each component). Which would mean that

isn’t quite right for norm being the standard Euclidian norm of the entire vector |\Psi\rangle. I think it also means that the recommendation to set either reltol or abstol, with the other set to 0 might need a little more nuance.

Let’s stick to state vectors |\Psi\rangle from unitary quantum mechanics. That means norm(Ψ) is always going to be 1, but each real and imaginary part of each component of Ψ is going to be some number between -1 and 1. It would be fairly common for one component being exactly 1 and the others exactly zero, or for a handful of components to have a magnitude on the order of 10^{-1} or 10^{-2} and the other components on the order of 10^{-14} (floating point noise). I would care about the (relative) error of the larger components, but obviously I wouldn’t care about the floating point noise.

If I’m thinking through this correctly, in this situation, if I set abstol=0, and reltol=1e-10, I would demand a relative error of 10 significant digits even on the floating point noise, which seems like it would be much too high a precision and only blow up my runtime. On the other hand, if I set abstol=1e-10 and reltol=0, I’m not quite nailing down the precision of the larger components that I actually care about: There might just be a single component with value 1.0, or the state might be a superposition of 50 basis states, so 50 components with amplitudes 1e-2 – and an absolute error of 10^{-2} on 1.0 vs. 0.01 is obviously a different relative error. That would probably still be manageable, so

seems like not a bad recommendation. Still, I think the best approach (which would go straight into the defaults of the ODE backend of QuantumPropagators) would be to use abstol=1e-14 to deal with the floating point noise (don’t care about anything approaching machine precision), and reltol=1e-8 for the number of significant digits in the larger components that I actually care about. The reltol might be something that I can vary depending on the use case, with smaller values if I really want to know the solution “to machine precision” or a larger value if I just need an approximate solution. I would generally then also consider that reltol to be a reasonably good measure (up to an extra digit or two) for the norm(Ψ - Ψ_exact) at the end of the propagation, while accepting the caveat that

It would be great if someone can confirm that the “step error condition” is in fact element-wise.

still seems like a major foot-gun. Specifically, what I ran into was setting only abstol, but then the default value of reltol being the thing that actually dominates the simulation (and the default reltol being too small for this kind of problem).

very much seems like something that would avoid this footgun. But for now, I think the lesson is to always set both abstol and reltol. Either with reltol=0 and some abstol, or (better), by setting reltol to the desired relative error and abstol always to 1e-14 (or, generally, whatever “zero” is for the coefficients of the vector).

P.S.: I think I might rename this topic to “Setting abstol and reltor when solving the Schrödinger equation with OrdinaryDiffEq”, because that’s really what ended up being the crux of the matter.

Also, for those of you with an interest in quantum physics, as a tidbit for context, this is the notebook I was working on when a ran into this problem: a little review of what happens numerically when going from “adiabatic” to “non-adiabatic” in the dynamics of an avoided (Landau-Zener) crossing: https://goerz-research.github.io/2025-02-04_Landau_Zener/LandauZener.html

1 Like

Looks like that matches the implementation: DiffEqBase.jl/src/calculate_residuals.jl at 6e6e63ba5424469f320e5dd84dde9d6fb0d6ee3a · SciML/DiffEqBase.jl · GitHub

It might be nice to have the option of norm-wise relative tolerance in addition to element-wise relative tolerance. But maybe that’s already in there … it’s a vast library that I don’t know well.

1 Like

seems like not a bad recommendation. Still, I think the best approach (which would go straight into the defaults of the ODE backend of QuantumPropagators) would be to use abstol=1e-14 to deal with the floating point noise (don’t care about anything approaching machine precision)

This is a very common misonception. Float64 goes all the way down to 1e-308. It’s just that your relative precision is only 10^16. If you’re working with circuits for example, it is fairly normal for your currents to have values around 1e-12 at which point, you definitely want an abstol for those variables far below 1e-14

1 Like

Yes… I’ve certainly run into problems on occasion with the misconception that “machine precision” always means looking for absolute errors on the order of the “machine epsilon”. The “always” abstol=1e-14 was meant in the context of simulating the dynamics of quantum state vectors, where the relevant amplitudes are guaranteed to be on the order of 10^{-2} to 1. In other contexts, abstol would be whatever can be considered “zero” relative to the amplitude of the “signal”.

1 Like

I genuinely thought it was norm-wise the way I wrote, but looking closer it’s actually a hybrid approach. First it calculates a scaled elementwise error:

\epsilon_i = \frac{\left|u_i^{(n)} - u_i^{(n - 1)}\right|}{a_i + r_i \max\left(\left|u_i^{(n)}\right|, \left|u_i^{(n - 1)}\right|\right)} \, .

The final error estimate is then some norm of this elementwise error vector; the default is the Hairer norm, which is a 2-norm scaled by the length of the state (I think this is used because DifferentialEquations allows you to change the length of your state vector during the evolution).

\epsilon_\text{total} = \left\| \boldsymbol{\epsilon} \right\|_H = \sqrt{\frac{1}{N} \sum_{i = 1}^N \epsilon_i^2} \, .

The step is accepted if \epsilon_\text{total} < 1.

Note that abstol and reltol are denoted as a_i and r_i, with subscripts, because you’re allowed to pass vector-valued tolerances to set different tolerances for each component.

This is my best interpretation of Timestepping Method Descriptions · DifferentialEquations.jl.

This does indeed mean that if abstol=0, the relative error of each component is given equal weight regardless of how the magnitude of that component compares to the overall norm of the state. I’m surprised. If this is correct, you certainly have to think more carefully about abstol and reltol than I claimed above. Though for normalized quantum states I suppose setting reltol=0 and abstol=1e-(<desired number of digits in the overall norm>) should do the trick, since with reltol=0 and scalar abstol, the fact that the elementwise error vector is scaled before taking the norm rather than after doesn’t make any difference.

@ChrisRackauckas can you confirm?