Differential Equations and Parameter Estimation

I’m about to attempt parameter estimation using DifferentialEquations, and have a couple of questions – I hope someone can share their experience. I’ve read the tutorial, and some previous discussion.

  • My problem is a single PDE describing how the temperature in a tank varies with some boundary inputs (influent temperature, some heating power, some volumetric flow rates, some control inputs). I have discretized the PDE using the finite difference method.
  • I have data consisting of 5 inputs (“manipulated variables”) and 2 temperature outputs sampled every minute.
  • One of the sampled output temperatures T_\mathrm{s} is a combination of an influent temperature T_\mathrm{i} and the (unknown) effluent temperature T_\mathrm{e}; T_\mathrm{s} = u_\mathrm{v} T_\mathrm{e} + (1-u_\mathrm{v})T_\mathrm{i} where u_\mathrm{v} is a control input (a split range valve setting).
  • Because of the form of the measurement data, T_\mathrm{s} is not a state, so I assume it is best to pose the model as a DAE so that I can have T_\mathrm{s} in the solution – ODEs can only have states as solution components, I guess.
  • I will probably use 2 hyper parameters in the model fitting: (i) the number n of compartments in the discretized model, and (ii) the weight \lambda of the regularization term.
  • I noted Chris’ response to alewolf on how to choose which variables in the DAE solver are used in the loss function (save_idxs argument).
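For concreteness, the sensor combination in the third bullet can be written as a tiny function (a sketch; the names are just the notation above):

```julia
# Mixed sensor reading: a split range valve setting u_v blends the
# effluent temperature T_e and the influent temperature T_i.
T_s(T_e, T_i, u_v) = u_v * T_e + (1 - u_v) * T_i

T_s(60.0, 10.0, 0.5)   # halfway-open valve mixes the two equally → 35.0
```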

My questions are the following…

  1. My input data are available at a discrete time interval (1 min). I can handle this in two ways:
    a. Set up a for loop, and solve the DAE over 1 minute with constant input, passing the final result as the initial condition to the next iteration of the loop, and let the cost function be the sum of the sub-costs, or
    b. Create an interpolation function for the input data so that the DAE solver knows the value at all times.
    Which is recommended, (a) or (b)?
  2. In general, initial states will be unknown, and I assume that I should pose these as unknown parameters in the model fitting? (n compartments…)
  3. The regularization term – it looks like the standard is to add the squared 2-norm of the parameter vector. If I have some idea of what the parameters should be… which is best?
    a. Add bounds for the parameters in the optimization problem, or
    b. Formulate the regularization term to be the norm of the deviation of the parameter from some initial guess?
    c. Combining the above, i.e., combine (a) and (b)?
  4. If my validation set does not immediately follow my training set in time, the initial states for the validation set will be unknown. How do I handle this problem?
  • Use state estimation for the validation problem?
  • Use state estimation for both the validation problem and the training problem for assessing hyper parameters?
  5. If I vary my two hyper parameters simultaneously (n and \lambda) and compare the RMSE of both the training set and the validation set to choose the hyper parameters, I get a 2D problem. Should I instead tackle one at a time?

Is the tank agitated? You can get some bounds on the behavior by using CSTR and PFR results as limiting cases.

Neither. Just solve one DAE with saveat=1, and then use the result array against all data points at once.
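A minimal sketch of that pattern, using a toy scalar model in place of the discretized tank (the model, the placeholder data, and the solver choice here are all invented for illustration):

```julia
using DifferentialEquations

# Toy stand-in for the semi-discretized tank: dT/dt = -k*(T - T_amb).
f(u, p, t) = -p[1] .* (u .- p[2])
prob = ODEProblem(f, [20.0], (0.0, 10.0), [0.5, 60.0])

# Solve ONCE with saveat matching the 1-minute sample grid, then
# compare the whole result array against all data points at once.
sol  = solve(prob, Tsit5(), saveat=1.0)
pred = reduce(hcat, sol.u)          # 1 × 11 matrix of predictions
data = fill(60.0, size(pred))       # placeholder data for illustration
loss = sum(abs2, pred .- data) / length(data)
```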

Yes, just make them functions of parameters being fit
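A sketch of that idea, with the unknown initial state carried as the leading entries of the fitted vector (the toy model and the `simulate` helper are invented for illustration; `remake` rebuilds a DiffEq problem with new `u0`/`p`):

```julia
using DifferentialEquations

f(u, p, t) = -p[1] .* u                     # toy right-hand side
prob = ODEProblem(f, [0.0], (0.0, 1.0), [1.0])

# Convention: pvec = [unknown initial states..., physical parameters...]
function simulate(pvec, nstates)
    u0 = pvec[1:nstates]
    θ  = pvec[nstates+1:end]
    # remake swaps in the fitted u0 and p without rebuilding the rest
    solve(remake(prob, u0=u0, p=θ), Tsit5(), saveat=0.5)
end

sol = simulate([2.0, 0.3], 1)               # u0 = [2.0], θ = [0.3]
```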

(a) is what’s usually done, (b) is a little weird but is in some sense related to putting a Gaussian prior distribution and doing MAP.
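Option (b) in code form: penalize the deviation from a prior guess θ0 instead of from zero (the names and the value of λ here are illustrative):

```julia
using LinearAlgebra

# Regularize toward a prior guess θ0 rather than toward zero;
# this corresponds to a Gaussian prior centered at θ0 (a MAP estimate).
reg_loss(loss, θ, θ0, λ) = loss + λ * norm(θ - θ0)^2

reg_loss(0.5, [1.1, 2.2], [1.0, 2.0], 0.1)   # 0.5 + 0.1 * 0.05 ≈ 0.505
```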

Your choice.

Or use the gradient.

1 Like

The tank model describes stratification in the temperature, so the temperature is distributed.

Thanks, Chris. My first question was perhaps not very clear. My model is typically:

  • \frac{dx}{dt} = f(x,u;\theta), with sampled outputs from y=g(x,u;\theta).
  • I have input-output pairs (u_i,y_i),\quad i\in \{1,\ldots,N_d\}, thus N_d data points sampled at 1 min interval.
  • The model solution x(t) is influenced by the input u(t)\in \mathbb{R}^5, and so is the output y(t)\in \mathbb{R}^2.
  • Because I don’t observe the states directly, but instead the mapped y variable, I think I need a DAE formulation instead of an ODE formulation.
  • Critical element of question: Because u(t) influences the solution, I assume that I need to either…
    i. Make an interpolant for u_i so that the DAE has u(t) available, or
    ii. Make a for loop, solving the model over the interval [t_i,t_{i+1}) with constant input u_i, and let the solution x(t_{i+1}) become the initial value for the next value of i?

Hm… there is perhaps a third way that looks like what you suggest:

  • Make the input an unknown parameter, use both u_i and y_i as data, and use parameter fitting to find the value of the unknown input.
  • In this approach, it should be straightforward to find the unknown input parameter, because it will be equal to the data points u_i?
  • I would still need to do some interpolation-like thing in the DAE model, to let the model know which of the input parameters to use.
  • I have lots of data; N_d is on the order of 1e4 – possibly sampled too often.

Anyway, is there an example around on model fitting with models influenced by inputs?

Only if they are implicitly defined. Otherwise it’s still an ODE. I guess the API on the parameter estimation is the issue here. You can define your own cost function that uses explicit definitions from the ODE solution.

This shows up enough that we made DataInterpolations.jl (SciML/DataInterpolations.jl, a library of data interpolation and smoothing functions) for this.
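A quick sketch of how it is used for sampled inputs (note the argument order: values first, then times; the interpolant is callable, so it can be evaluated inside the ODE right-hand side):

```julia
using DataInterpolations

t = collect(0.0:1.0:5.0)              # 1-minute sample grid
u = [1.0, 1.5, 2.0, 1.8, 1.2, 1.0]    # sampled input values

itp = LinearInterpolation(u, t)       # values first, then times
itp(2.5)                              # value between samples → 1.9
```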

1 Like

I’ve fine-tuned my simulation code with experimental data, etc., and am ready to start parameter estimation…

Four questions:

  1. build_loss_objective – does it compute a single shooting cost function? (If so, why not give it the more descriptive name single_shooting_objective, in line with multiple_shooting_objective?)
  2. The build_loss_objective function — I assume that it takes the entire parameter object p of the DifferentialEquations problem as the input argument to the resulting cost function, and returns a scalar measuring the model fit?
  3. Is it required that the parameter object p of the DifferentialEquations object is a vector, or can it be any ordered collection such as a named tuple?
  4. Is it possible to specify a subset of parameter object p for the DifferentialEquations problem?

Legacy. This is a good breaking change to propose for DiffEq 7.0

It needs to be a vector and it needs to be the full vector. This is somewhat of an interface problem that we need to sort out with the optimization libraries. And yes the cost function reduces to a scalar.

Ah…

julia> p=(a=1,b=2)
(a = 1, b = 2)

julia> keys(p)
(:a, :b)

julia> values(p)
(1, 2)

Is there a way to “merge” these two tuples (keys(p) and values(p)) back into p?

OK… I figured it out:

julia> p
(a = 1, b = 2)

julia> k=keys(p)
(:a, :b)

julia> v=values(p)
(1, 2)

julia> NamedTuple{k}(v)
(a = 1, b = 2)

This should make it possible to use a named tuple p as argument to the DifferentialEquations model… something like:

julia> p = (a=1,b=2,c=3,d=4,e=5,f=6)
(a = 1, b = 2, c = 3, d = 4, e = 5, f = 6)

julia> k=keys(p)
(:a, :b, :c, :d, :e, :f)

julia> val=values(p)
(1, 2, 3, 4, 5, 6)

julia> vec=collect(val)
6-element Array{Int64,1}:
 1
 2
 3
 4
 5
 6

julia> idx = [1,3];

julia> vec[idx]
2-element Array{Int64,1}:
 1
 3

julia> vec[idx] = [3,1]
2-element Array{Int64,1}:
 3
 1

julia> vec
6-element Array{Int64,1}:
 3
 2
 1
 4
 5
 6

julia> NamedTuple{k}(Tuple(vec))
(a = 3, b = 2, c = 1, d = 4, e = 5, f = 6)

I don’t know how efficient this will be, but it will allow me to: use named tuples as the parameter object in DifferentialEquations; specify the indices of the subset of parameters that I want to change in the model; convert these to a vector; use that vector as the input argument to the cost function for the optimization algorithm; read the improved parameters back from the optimization code; convert these back to a named tuple for the DifferentialEquations model; etc., etc.
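The round trip sketched in this paragraph can be wrapped in two small helpers (names invented; this is just the `keys`/`values`/`NamedTuple{k}` machinery from above):

```julia
# Extract the tunable subset of a NamedTuple into a Float64 vector...
to_vec(p::NamedTuple, pidx) = Float64.(collect(values(p))[pidx])

# ...and write an updated vector back into a fresh NamedTuple.
function from_vec(p::NamedTuple, pidx, pvec)
    vals = collect(values(p))
    vals[pidx] .= pvec
    NamedTuple{keys(p)}(Tuple(vals))
end

p = (a=1.0, b=2.0, c=3.0)
v = to_vec(p, [1, 3])                # [1.0, 3.0]
from_vec(p, [1, 3], [10.0, 30.0])    # (a = 10.0, b = 2.0, c = 30.0)
```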

OK – my parameter estimation works, but my loss() function causes lots of memory usage… I’m looking for tips to reduce the memory consumption/reduce garbage collection.

To recap/expand:

  1. My model of a stratified water tank is a PDE, which I have semi-discretized into nT ODEs. I want to be able to play around with nT, and I need to pose the initial values T_i(0) as unknown parameters — hence changing nT changes the size of the parameter vector.
  2. I have two response observations: the temperature in the upper part of the tank (the index i in T_i varies with nT), plus an observation outside of the tank where effluent water from the tank is mixed with influent water. This second observation is an algebraic mixture of an input variable and a state. I can handle this in two ways:
    a. Pose the system as a DAE, or
    b. Pose the system as an ODE and write my own loss function.
    I’ve done the latter (b).
  3. Stupid or not, I have transferred physical parameters to the model in a NamedTuple named p. I want the set-up to be flexible, so I have set up the code so that I can choose an index vector pidx to specify which physical parameters in p I want to adjust/“train”.
  • This has some consequences: the parameters to tune need to be in a vector of Float64 numbers (at least in Optim). So there is some management to pick out values from the NamedTuple p, insert them in the parameter vector to tune pvec, etc.
  • The code could have been simplified if I instead had used physical parameters in a vector directly (instead of a NamedTuple). But then (i) I would have lost the possibility to use the field name in my model code (e.g., I use p.g for gravity — instead of the more obscure p[1]), and (ii) I would have to make error-prone changes in the model code every time I change the parameter set that I want to estimate. This is very much a trial and error thing, so flexibility is important.
  4. Experimental input data are transferred via a NamedTuple f of linear interpolation functions.

OK – here is my code — which is highly inefficient (I think…):

using LinearAlgebra   # for norm()

function cost(pvec,p,pidx,f,wt,T0,tspan,dt,ks1,data)
# pvec:  the Float64 vector of unknown parameters
# p:     the NamedTuple of physical parameters
# pidx:  a vector of Int with indices of the parameters of p that I want to tune
# f:     a NamedTuple of linear interpolation functions which specify time functions of experimental input data
# wt:    the name of my water tank model; needed to create the simulation problem
# T0:    initial guess of temperatures in the model. These are inserted into `pvec` and are adjusted
# tspan: simulation time span -- needed to set up the simulation problem
# dt:    sample time in the experimental data
# ks1:   index of the computed $T_i$ where the sensor of the first observation is located
# data:  matrix of data from observations -- 2 rows; one state (index ks1) and one algebraic output

    nT = length(T0)       # Number of initial states that are taken from `pvec` and used as initial values in the simulation problem. The other elements of `pvec` are inserted in the NamedTuple `p_k` --- the update of `p`.
    p_key = keys(p)
    p_val = collect(values(p))
    p_val[pidx] .= pvec[nT+1:end]
    p_k = NamedTuple{p_key}(Tuple(p_val)) # I name the updated parameter NamedTuple `p_k` to avoid overwriting the initial guess I have
    #
    prob = ODEProblem(wt,pvec[1:nT],tspan,p_k)   # Simulation/shooting problem, with updated initial states
    sol = solve(prob, saveat=dt)
    Sol = reduce(hcat,sol.u)
    Tm_i_L = f.T_i.(sol.t).*(1 .- f.u_v.(sol.t)) .+ Sol[end,:].*f.u_v.(sol.t)   # Algebraic variable; mixture of input temperature and output temperature, weighted by the valve opening
    sim = [Sol[ks1,:]'; Tm_i_L']   # picking out the simulation/shooting outputs that I want to compare with the experimental data
    N = length(sim)
    #
    return norm(sim - data, 2)^2/N
    #
end

with loss function for the Optim code given as:

loss = (pvec) -> cost(pvec,p,pidx,f_r,wt_r,T0,tspan_o,dt_r,kS1o,data_r)

Optimization is done with:

@time res = optimize(loss,p_lo,p_hi,pvec,Fminbox())

…which takes 8 minutes on my desktop workstation with 32 GB RAM, and some estimated memory use of 74 GiB.

Anyway, any tips for improvements are appreciated! I know it is a pain to read the code of others, so please bear with me…

There are loads of variables that are introduced every time the cost function is called, so I understand that the code is not good… I have thought of passing these as modifiable input arguments to the cost function, but…
A. Are all variables mutable? E.g., prob, sol, etc?
B. Is it possible to “update”, e.g., prob in an efficient way?
C. Etc.

Any suggestions are appreciated. I know that I can rewrite the problem into a DAE and use the standard set-up, but that will have a dramatic cost w.r.t. flexibility — I think.

When you profile, are the array allocations a large chunk of the time? Finding out what to optimize is sometimes more important than finding out how to optimize. If a profile shows most of the time is spent in solve, then the efficiency of the rest of the cost function doesn’t matter.

2 Likes

Thanks for the tip. I’ve never done profiling before… should I use the Profile package? Is the ProfileView package useful, or is my problem too simple for that?

I used Fminbox() from the Optim package. I observed (see comments under the Optimization category) that Fminbox() seems to miss the solution somewhat in the scalar case.

Do you recommend another solver?

  • Another solver from Optim, or
  • A specific solver from JuMP?
  • Or another solver?

This describes how to profile with Juno: macOS Python with numpy faster than Julia in training neural network - Stack Overflow . And if you’re not using Juno, ProfileView.jl will give you a similar GUI.
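If neither GUI cooperates, the stdlib `Profile` module alone already prints file:line sample counts (a minimal sketch, with a stand-in workload in place of the real loss function):

```julia
using Profile

work() = sum(sqrt(i) for i in 1:10^6)   # stand-in for the loss function

work()                                  # warm up so compilation isn't profiled
Profile.clear()
@profile work()
Profile.print(format=:flat, mincount=10) # flat listing with file:line counts
```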

2 Likes

Here is the result from ProfileView in IJulia. From the user description at ProfileView.jl, I thought that I could hover the mouse over the various areas and get some more insight, but hovering doesn’t help… (Btw: I needed to increase the resolution from the default 10^6…).

How can I use this profile view?

NOTE: when using ProfileView, I’m told that:

julia> using ProfileView
[ Info: Precompiling Gtk [4c0ca9eb-093a-5379-98c5-f87ac0bbbf44]
ERROR: LoadError: Cairo not properly installed. Please run
Pkg.build("Cairo")

If I run Pkg.build("Cairo"), I get error messages:

julia> Pkg.build("Cairo")
  Building LibCURL ─→ `C:\Users\user_name\.julia\packages\LibCURL\teHPG\deps\build.log`
  Building WinRPM ──→ `C:\Users\user_name\.julia\packages\WinRPM\Y9QdZ\deps\build.log`
  Building Homebrew → `C:\Users\user_name\.julia\packages\Homebrew\s09IX\deps\build.log`
  Building Cairo ───→ `C:\Users\user_name\.julia\packages\Cairo\p68X8\deps\build.log`
┌ Error: Error building `Cairo`:
│ [ Info: Updating WinRPM package list
│ [ Info: Downloading https://cache.julialang.org/http://download.opensuse.org/repositories/windows:/mingw:/win32/openSUSE_Leap_42.2/repodata/repomd.xml
│ [ Info: Downloading https://cache.julialang.org/http://download.opensuse.org/repositories/windows:/mingw:/win64/openSUSE_Leap_42.2/repodata/repomd.xml
│ [ Info: Nothing to do
│ [ Info: Nothing to do
│ ERROR: LoadError: Provider WinRPM.RPM failed to satisfy dependency pango
│ Stacktrace:
│  [1] error(::String) at .\error.jl:33
│  [2] satisfy!(::BinDeps.LibraryDependency, ::Array{DataType,1}) at C:\Users\user_name\.julia\packages\BinDeps\ZEval\src\dependencies.jl:945
│  [3] satisfy!(::BinDeps.LibraryGroup, ::Array{DataType,1}) at C:\Users\user_name\.julia\packages\BinDeps\ZEval\src\dependencies.jl:910
│  [4] satisfy!(::BinDeps.LibraryGroup) at C:\Users\user_name\.julia\packages\BinDeps\ZEval\src\dependencies.jl:874
│  [5] top-level scope at C:\Users\user_name\.julia\packages\BinDeps\ZEval\src\dependencies.jl:977
│  [6] include at .\boot.jl:326 [inlined]
│  [7] include_relative(::Module, ::String) at .\loading.jl:1038
│  [8] include(::Module, ::String) at .\sysimg.jl:29
│  [9] include(::String) at .\client.jl:403
│  [10] top-level scope at none:0
│ in expression starting at C:\Users\user_name\.julia\packages\Cairo\p68X8\deps\build.jl:176
└ @ Pkg.Operations C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Pkg\src\Operations.jl:1075

I don’t know if these error messages are related to those at Problem with Cairo building - #10 by tobias.knopp (which seems to be for a Linux installation). The ProfileView version is [c46f51b8] ProfileView v0.4.1.

Anyway, I’ll transfer the commands from IJulia to Juno, and try without using ProfileView.

The line numbers are really the thing to be seen. Is all time spent in the ODE solve?

I redid the simulations in Juno; Juno.profiler() worked without the Cairo problems of ProfileView:

When I hover the mouse over the various areas in the profile plot, the areas encircled in blue appear to relate to Optim, while those encircled in red appear to be related to DifferentialEquations. More or less.

What do I learn from this?

Wow, that is a very deep stack. Did you do an analysis to check which lines/functions the time is generally spent in? This graphic omits all of the line information.

I didn’t see how I could read the line number in Juno from the profile. I’ll check again after the weekend. I need to give priority to some application deadline + a midsummer hike in the mountains (in the rain).