MTK libaries & initial values vs. initial guesses?

In MTK libraries, should I include “hard-coded” initial values in components, or not?

I’m toying with an alpha version of a library based on MTK, and have now come to the stage of testing whether I can linearize the completed model… To find the operating point, I simulate the system to “steady state”, then grab the final values of the unknowns, and use these as operating point in the linearization attempt.

In my components, I have “hard-coded” initial values for some variables just to minimize the number of initial variables a user would have to specify. I don’t know whether this is good practice.

Questions:

  1. Is it ok to “hard-code” initial values for “states” in the components, to minimize the amount of initialization the user has to do? Or is it better to just use “guess” values in the components?
  2. When I have “hard-coded” initial values for variables that are not “unknowns”, and I then specify unknowns as initial values, I seem to get error messages (or warnings?) that the initialization is over-determined. How do I get around these warnings/error messages?
  • I can set the variables that I have hard-coded in the component to nothing, but that doesn’t work well for vector variables. And it is also adding to the burden of the user to know which ones have been hard-coded…
  • Is there a command to remove all “hard-coded” variables in the model, so that I can start with the known steady state unknowns?

OK, I hope my “problem” is explained such that it is understandable.

Ancient wisdom teaches to avoid “magic numbers.” I’m unsure, with you, as to whether that should apply in your example, but it is common practice to define functions with default values. Would that work?

For some “states”, one can have good guesses of initial “states” (in the meaning of the minimal information that is required to specify the system simulation). In a simple case, I can have:

\frac{\mathrm{d}x}{\mathrm{d}t} = f(x;t)

where I specify x(t=0) in the component. What I mean by “hard-code” is:

var @variable begin
    ...
    x(t)=x_0
    ...
end

where I may specify x_0 prior to these statements in the component, or transfer x_0 as an argument in the function call.

The alternative would be to not “hard-code” the initial value, but instead give a guess value:

var @variable begin
    ...
    x(t), [guess=x_0]
    ...
end

The “problem” is that the compiled system has a number of first order differential equations and a number of algebraic equations. To have a proper initialization, one normally needs to specify (“hard-code”) the same number of values as there are differential equations, while for the remaining, there must be initial guesses (unless the algebraic equations are linear).

So I have to some degree used “hard-coding” of initial values (x(t) = x_0) instead of guess values (x(t), [guess = x_o]) so that the user does not have to specify initial values in the formulation of the ODEProblem (ODEProblem(csys, [x => x_0],...).

The down side to this is that the values I have “hard-coded” may not be among the unknowns of the system. In other words, if I specify initial values to be the steady state values of the unknowns ( ODEProblem(csys, [unk .=> x_ss],...)), I may overspecify the initialization.

I didn’t mean to suggest magic number in its pejorative sense and I should have given some more thought to the specific problem.

An initial value for a differential state is a degree of freedom, not an arbitrary literal, and for a consistent DAE initialization you pin exactly as many as there are differential equations and guess the rest. The maxim I was reaching for is narrower than I put it: surface meaningful values as named, overridable quantities rather than inline literals — which is what a named x_0 plus [guess = x_0] already does.

On the overspecification worry, I’d lean on the defaults-vs-initialization-equations distinction. A pinned value attached as a default on the variable is silently overridden when the user passes u0, so it won’t fight a later [unk .=> x_ss]. A value written as an initialization equation adds a constraint, and that’s what overdetermines the system. So for “works out of the box but stays overridable,” keep the pinned values as defaults plus guesses for the algebraic unknowns, and reserve initialization equations for genuinely structural constraints. A user who actually wants to start at steady state is better served solving for it (SteadyState/Nonlinear) than hand-specifying every unknown.

I don’t quite get what you mean by “pinned value”. As far as I know, writing:

... x(t) = 1

provides a default value for x(t=0), but this also works as an initialization equations, which can be over-ridden by [x => 2].

If I instead write,

...x(t), [guess=1]

this is instead treated as an initial guess in a nonlinear solving process, if needed.

And if I then write [x => 1], this overrides the guess, and makes it into an initialization equation instead.

In principle, I think I need to have as many initial equations as there are (first order) differential equations. And I should have as many guesses as there are algebraic constraints in the compiled equations.

Problem is if I specify default initial values (x(t)=1) on a variable that is not an “unknown” but instead an observed variable, and then add constraints in the ODEProblem ([unk .=> x_ss]) I may end up overspecifying the initialization problem. Most likely the initial value that is specified for the observed variable is not compatible with the steady state solution x_ss, so I will get contradictions.

It is possible that MTK then will try to find a least squares solution, I don’t know.

Alternatively, if I avoid the problem by only providing “guess” values in the component (x(t), [guess = 1.0]) I will likely end up having an underspecified initialization problem. Again, how is that treated? If I only have guess values and no initialization equations, perhaps the solver will use least squares and be able to find a workable set of initial values for the unknowns.

In the latter case: specifying “guesses” only in the components, this makes it much simpler to re-start the system from a steady state [unkn .=> x_ss] because the system will then not be overspecifed.

Anyways, if your suggested pinned value is something else that solves my problem, I am eager to learn.

You are right and I am wrong, about “pinned” as about so much else. A value attached to a variable (whether as a default/binding or passed as [x => 1] to the problem) is turned into a constraint equation in the generated initialization system. A [guess = 1] is only a starting point for the nonli;near solve and adds no constraint. My “pinned” idea invented a default that constrains nothing, which doesn’t exist.

Have you looked at the tutorial in the Documenter? There’s a very thorough looking treatment of initialization and what happens with overdetermination. I’m not going to hazard a summary, but you might find more clarity there. (This is definitely NOT a RTFM for you, although it should have been for me)

It is.

Overspecifying can be fine if all of those constraints are valid. The issue I find with people doing this though is people going “oh, I just put that as a default for one model, but now I made another model and I wanted it to find a good x!!!”. If you set x ~ 1 for initialization, it will make sure that’s true! So only put equations that you know you want to force to be true, otherwise you will have issues.

If you don’t always want that equation to be true, use a guess.

OK, so:

  1. If the system is over specified wrt. initial constraints, then a least squares solution is found for the initialization – that is how I interpret CR’s answer.
  2. If the system is under specified, but has sufficient number of guess values, will similarly a least squares initialization be used but based on the guess values? If so, that is a good argument for avoiding specifying initial values (what I call “hard-coded”)…
  3. Often a least squares initialization is ok. But there may be cases where you do not want, say, negative values (e.g., negative absolute pressure, negative absolute temperature). Can I then add a qualifier that says the solution should be, say, positive?
  4. When I create the numeric problem (constructor: ODEProblem), the initialization problem is solved. And I am told whether the initialization succeeded or not (sometimes I get…
Warning: The initialization system is structurally singular. Guess values may significantly affect the initial values of the ODE. The problematic variables are SymbolicUtils.BasicSymbolicImpl.var"typeof(BasicSymbolicImpl)"{SymReal}[valve₊a₊md(t)].
│ 
│ Note that the identification of problematic variables is a best-effort heuristic.

I don’t know if that should scare me… It still works, though.)

  1. If I remake the problem, is then the initialization problem solved, or is simply the unknowns used with an assumption of consistency?

I guess that yes once this will be solved ? Add bounds generation to NonlinearProblems · Issue #4308 · SciML/ModelingToolkit.jl · GitHub

  1. Yes
  2. Yes
  3. Not yet, but numerically it’s possible we just need to expose it in MTK.
  4. You can turn off the warning, but it’s there because most of the time not being well-conditioned is a mistake. But note that no, the initialization is done at solve, not at ODEProblem.
  5. It’s still only solved at solve, so it’ll happen at the next solve call.

Yes exactly why I say not yet. NonlinearSolve.jl already exposes it, MTK has the metadata, it just doesn’t pass that metadata to the solver. Probably Claude could one-shot that. Maybe I can just spawn that off now and see.

To my Point 4:

OK, my “confusion” in asking whether initialization is done in constructor ODEProblem is that this constructor responds with (in one of my cases):

ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
Initialization status: FULLY_DETERMINED
Non-trivial mass matrix: true
timespan: (0.0, 80.0)
u0: 10-element Vector{Float64}:
 849.9992116249999
 850.0140866249999
   0.48391164659730024
  63.610204044770704
 116.98481298770999
   0.0
   0.0
   0.0
   0.0
   0.0

So the constructor explicitly states that the initialization status is fully determined, and provides initial values (?) for the unknowns.

Perhaps I misunderstand this? Perhaps the statement is that initialization can be found, but this is not done yet and will be done by the solver?

As to my Point 5:

Clearly, when the system starts to move from the specified initial unknowns, the algebraic constraints must be satisfied, and a solver must be used to ensure consistency.

But this stepping procedure is not what I mean by “solving the initialization problem” – in this stepping, the solution of the algebraic equations can not use the initialization “guess” values (from the component) as initial iteration value in the algebraic solver, I assume.

From what I understand by your answers, this “consistency” checking/initialization solving is also done by the solver at initial time, and therefore the “hard-coded” initial constraints matter also for the solver?

Initial structure. Not the actual initial values if MTK generated.

It’s used in the initialization solve.

I have a lingering question about MTK’s =. I find this confusing from tutorial:

@parameters τ = 3.0 # parameters and their values
@variables x(t) = 0.0 # dependent variables and their initial conditions

It requires = to mean different things: τ is always 3 and x(t) not always zero. (And the comment shouldn’t be needed for clarity.) What’s the mental model for understanding the equals sign, and why couldn’t the dependent variable separated from the initial condition, e.g. @variables x(t) [x(0)=0] to separate the concerns?

I do understand there are longer and clearer alternatives, e.g. initial_conditions = [x => 1, p => 1]) or prob = ODEProblem(pend, [x => 1, y => 0, g => 1]. I am specifically confused by the equals sign in short form @variables x(t) = 0.0.

Personally, I don’t find this “double” use of equality too confusing.

OK, the term “parameter” is used differently in different fields. E.g., in “distributed parameter system” vs. in more mathematical oriented fields where “parameter” more or less is equivalent to a constant value. In Modelica, a distinction is made between “constant” which is a natural or mathematical constant (e.g., pi, Planck’s constant, etc.) and “parameter” which is a quantity that has constant value, e.g., radius of a pipe, etc.

So, to me, tau = 0 makes perfectly sense for a “parameter”.

Also, the very word “variable” clearly means something that varies, so x(t)=0.0 should not be interpreted as something that is constant for all time. And if it does not mean something that is constant for all time, it makes sense that it is the value at a particular “initial” time.

But perhaps my mind has been wired to think like this; maybe this is not natural for other people.

We only “know” that because we looked up the tutorial, or guessed because it’s least insensible.

Any math professor unfamiliar with Julia can usually understand the code. In plain Julia x(t) = 0 pretty much means x(t)=0. Normally x(0) = x_0 is needed to restrict when equality applies.

In MTK you use @variables x(t) to express both dependent and independent variables, great. Other info usually goes in square brackets: description, state_priority, connect, input, bounds, nominal, guess, disturbance, tunable, irreducible. The naive user can parse these reasonably well.

But @variables x(t) = 0 forces me to override my natural instincts. Good syntax expresses intent, but this obscures intent and can’t be interpreted literally.

I wish it were clearer, like @mtkcompile sys = System(… initial_conditions = [x => 1, p => 1]) which the math professor would understand. Still better would be x(0) =>1 since why not allow the user to specify a boundary condition anywhere? It’s just a convention to start time at zero.

It’s now solved after a few releases go out.

it probably ought to be an option “initial_val” that goes with all the other options like “state_priority”, “guess” etc.

I’ve been thinking about this and haven’t been able to figure out why this choice was made, though it was clearly made quite intentionally. The guesses keyword completely changes its meaning depending on whether or not the initialization problem is underdetermined. In the fully determined case, guesses is just the starting point x_0 for the nonlinear solver, which converges to whichever solution x_*' it happens to converge to and there is no guarantee that |x_0 - x_*'|_2 \leq |x_0 - x_*|_2 for all valid solutions x_*. But for the underdetermined case, the initialization turns into an optimization problem where that inequality is enforced, even though selection of appropriate x_*' could easily be left up to the solver.

See this example for how this may (for me) unexpectedly affect initialization:

# Use the same guesses dict for both problems
guesses = [x => 0.3]

# Fully determined system with nonlinear algebraic constraint
eqs0 = [D(y) ~ x
        0 ~ x * (x - 1) * (x + 1)] # Roots are x=-1.0, x=0.0 and x=1.0
@mtkcompile test0 = System(eqs0, t)
prob0 = ODEProblem(test0, [y => 0.0], (0.0, 1.0); guesses = guesses)
sol0 = solve(prob0, Rodas5P())
# -> Initializes x to 1.0 even though 0.0 would be closer

# Underdetermined system with the same algebraic constraint
eqs1 = [D(y) ~ z
        D(z) ~ 0
        0 ~ x * (x - 1) * (x + 1)] # Roots are x=-1.0, x=0.0 and x=1.0
@mtkcompile test1 = System(eqs1, t)
prob1 = ODEProblem(test1, [y => 0.0], (0.0, 1.0); guesses = guesses)
sol1 = solve(prob1, Rodas5P())
# -> Initializes x to 0.0

This seems inconsistent/confusing and like a potential performance sink for underdetermined systems where states without explicit initial conditions might simply not matter that much. Solving a nonlinear optimization problem for initialization seems like overkill then.
Can anybody point me to the rationale behind this behaviour? Is there a keyword to opt out?

It’s just the initial condition in both cases.