SciML: optimizing a control signal

Suppose I’ve got a differential equation that takes some forcing / control signal as input defined as a DataInterpolation.LinearInterpolation(u, t) (similar to the tutorial example). I’d like to find the u vector which minimizes some loss function. How should I do that with ModelingToolkit? I’m fine with the OptimizationProblem part of the solution, but I don’t know how to handle the u control signal.

  • I can pass the individual u1, u2, u3 as parameters, then call LinearInterpolation([u1,u2,u3], t). That doesn’t look very practical, but maybe I can create many parameters in a comprehension?
  • I believe that I can pass non-numerical values (like Vector, or LinearInterpolation objects) as parameters, by specifying their type somehow. I can’t find the documentation anymore, but is that how it’s done?
  • I guess I could look into DiffEqFlux.jl?

Solving ODE parameters using experimental data with control inputs - #20 by zdenek_hurak is related but doesn’t really have an MTK answer.

This needs an issue. We currently don’t have a good MTK interface on it, unless @baggepinnen has a controls functionality specifically for it.

1 Like

The problem you describe seems to be a standard optimal control problem. See ModelPredictiveControl.jl for an existing package.

If you want to write it yourself for a continuous time model, a straightforward and simple way to do it is to use a vector of unknowns [u_0, u_1, ..., u_N-1] changing at times [t_1, t_1, ..., t_N]. If you want to use ModelingToolkit, I’d guess the simplest way is to use the DataInterpolations.jl package (note: in your text, you miss an s at the end of the package name). In Model Predictive Control, it is quite common to assume “zero order hold” control input, which means you should use the ConstantInterpolation function, not the LinearInterpolation function. But there is now general rule – perhaps you get better results if you use other forms, e.g., spline interpolation?

So a straightforward procedure would be:

  • Define the optimization variables U_j = [u_1^j, ..., u_{N-1}^j] with changes at T=[t_1, ..., t_N] and let U_0 be your initial guess.
  • Construct the interpolation function, say, ConstantInterpolation(U_0, T)
  • Pass the interpolation function to your MTK model.
  • Construct a cost function, which will depend on U_j
  • Use an optimization algorithm to change U_j until optimum is reached.

Things to think about:

  • How efficient is it to construct a new interpolation function each time step? I don’t know, but hopefully relatively efficient.
  • The procedure sketched above uses “single shooting”, i.e., solves the prediction model forward in one go. This may not be a good idea if the model is “unstable”. For stable systems, it should work relatively well, I’d think.
  • If the system is more problematic, you may use “multiple shooting”: divide the future horizon into sub-horizon, and make guesses for the initial values at the start of each sub-horizon. Then use MTK to solve the model over each short sub-horizon with the “guessed” intial values for each sub-horizon. With a short horizon, there is much less chance of something going wrong wrt. instability. However, most likely, the solution at the end of one sub-horizon will not match the guessed initial value of the subsequent sub-horizon, so you need to iterate on the guesses of initial values for each sub-horizon until you achieve a continuous solution.
  • The best method may be Biegler et al.'s idea of simultaneous collocation and optimization, where some collocation approximation is introduced for the model prediction in combination with the discretized input U_j.

You would need to register the interpolation input function for use in MTK. It is possible that you also would need to define the derivative of the interpolation function if you want to use an optimization code that takes into account the gradient of the cost function – I don’t know how that could be done.

1 Like

That’s the part I don’t understand. How can I pass a function to an MTK model? With

@mtkmodel FOLExternalFunction begin
	@parameters begin
		f
	end
    @variables begin
        x(t) # dependent variables
    end
    @equations begin
        x ~ f(t)
    end
end

@mtkbuild fol2 = FOLExternalFunction() errors with Sym f is not callable. Use @syms f(var1, var2,...) to create it as a callable. and then I’m lost on how to use that.

I found:

SampledData Component · ModelingToolkitStandardLibrary.jl (sciml.ai) has this code:

function get_prob(data)
    defs[s.src.buffer] = Parameter(data, dt)
    # ensure p is a uniform type of Vector{Parameter{Float64}} (converting from Vector{Any})
    p = Parameter.(ModelingToolkit.varmap_to_vars(defs, parameters(sys); tofloat = false))
    remake(prob; p)
end

to update a SampledData object. On the plus side, it looks like one can differentiate through remake. But it looks like it’s limited (cannot do linear interpolation?) and the interface is a bit rough.

Here is what I meant with “passing” the input to the model… I use a simple model of a body following Newton’s law (with friction) as an example:

# Packages
using ModelingToolkit
using ModelingToolkit: t_nounits as t, D_nounits as Dt
using DataInterpolations
using DifferentialEquations: solve
using Plots
# Some constants
LW1 = 2.5
LC1 = :red 
LC2 = :blue 
LC3 = :green
LS1 = :solid 
LS2 = :dashdot
# Registering the control input function
@register_symbolic u(t)
# Model instantiator
@mtkmodel Body begin
    # Model parameters
    @parameters begin 
        k=0.01,    [description = "Linear friction coefficient, 1/s"]
    end
    # Model variables, with initial values needed
    @variables begin
        x(t)=1,     [description = "Position, m"]
        v(t)=0,     [description = "Velocity, m/s"]
        # u(t),       scaled force, defined in the global name space
    end
    # Providing model equations
    @equations begin
        Dt(x) ~ v
        Dt(v) ~ u(t) - k*v
    end
end

# Instantiating a model named "body"
@mtkbuild body = Body()

With the last statement, Julia should respond with a typesetting of the model:
image

Continuing with defining the time points for control change (N control values) and setting the control values to zero initially:

t_final = 10
N = 3
# Linear spread in time points
T = collect(range(0, t_final, length=N+1))
# Initial input
U = zeros(N+1)
U[end] = U[end-1]  # because of zero order hold

We can now create a numeric problem from the above symbolic model:

tspan = (T[1], T[end])
#
prob = ODEProblem(body, [], tspan)

Next, we create an interpolation function u_c based on U and T, and set the control input u equal to the interpolation function.

u_c = ConstantInterpolation(U, T)
u(t) = u_c(t)

Finally, let us solve the numeric problem + plot the states and the input function:

sol = solve(prob; tstops = T)
#
fg_state = plot(sol, lw=LW1, lc=[LC1 LC2], ls=LS1)
fg_input = plot(T, U; st=:steppost, lw=LW1, lc=LC3)

So far, so good.
In an optimization sequence, we need to change the control inputs. This essentially implies changing the elements of vector U. Just to illustrate how to do it:

U = 0.1*randn(N+1)
U[end] = U[end-1]
u_c = ConstantInterpolation(U,T)

Because we have changed u_c, that means that the registered function u automatically also is changed, and this also changes the numeric problem – it is not necessary to change the creation of the numeric problem – we simply solve the numeric problem again:

sol = solve(prob; tstops = T)
#
plot!(fg_state, sol, lw=LW1, lc=[LC1 LC2], ls=LS2)

leading to:


while

plot!(fg_input, T, U; st=:steppost, lw=LW1, lc=LC3, ls=LS2)

gives:

Some final comments

  1. Here, U[1:N] are the control variables, i.e., the unknowns in the optimization problem.
  2. For the optimization problem, you need to construct a cost function. The cost function takes as input U[1:N].
  • Within the cost function, you take U[1:N] and create the interpolation function u_c (in my code above – choose another name if you want to).
  • Then you solve the model over the interval tspan.
  • Then you construct your choice of cost, typically based on the solution of the model and the value of the control input (the “scaled force”). As an example, the cost could be the sum of squared deviations between the position and some reference value + the sum of squared force used.
  • With input U[1:N], the cost function needs to return the cost.
  1. You then choose some optimization code, select an initial guess of U [e.g., a zero vector], and pass the initial guess + the cost function to the optimization solver.
  2. With the above procedure, you can easily include constraints on the inputs u. It is not so straightforward to provide constraints on the states (e.g., the position).
  3. In my description above, it will probably not work to use an optimizer that is based on the gradient of the cost function. But it should work fine to use an optimization solver that does not use the gradient, e.g., BlackBoxOptim.
  4. If you want to use a gradient-based solver, you may need to add more information to the register_symbolic macro. I don’t know how to do that.
1 Like

Thank you for the detailed example! It’s basically Custom-Component-with-External-Data, but your version really helped me out.

  1. If you want to use a gradient-based solver, you may need to add more information to the register_symbolic macro. I don’t know how to do that.

I think your example will work even without @register_symbolic. Since the derivative of ConstantInterpolation is already defined in DataInterpolations.jl, it should “just work”.

My only qualm is around ODEProblem. I think it assumes that all numeric values are “up for mutation” (eg. it won’t look that a parameter is currently 0, and rearrange the formulas based on that)? If so, then I think this should work alright. Frequently Asked Questions · ModelingToolkit.jl (sciml.ai) also points in that direction.

Thanks again!

u_c = ConstantInterpolation(U, T)
u(t) = u_c(t)

I tried my best, but I think this won’t work for ForwardDiff. AFAICT ModelingToolkit sets up its arrays based on @parameters, and since none of them are Dual, the duals inside of ConstantInterpolation don’t propagate (no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(Main.var"workspace#18".loss), Float64}, Float64, 8}))

Maybe introducing a dummy Parameter that is Dual…