I want to model a quadrocopter with MTK (ModelingToolkit.jl). In the first step, I want to identify the parameters of a simple model from real flight data. In the second step, I want to learn effects that are not yet modeled from the data, trying different approaches (neural networks, universal differential equations, …). Finally, I want to use the model to optimize controller parameters. To do this, I need to differentiate a trajectory error through the entire model with respect to the controller parameters.

1. Should discrete inputs be modeled as variables or as parameters? (Keeping in mind that I will later need a gradient from the continuous model into the discrete controller.)

2. I currently use a variable because that makes it more convenient to get the data into the output for logging. Is there also a solution with changing parameters?
Update parameters in ModelingToolkit using callbacks

3. Is there a better way to connect the continuous part with the discrete part than callbacks? (I have not yet tried to differentiate through this structure automatically, which I will need later to optimize the controller parameters.)
ModelingToolkit DiscreteUpdate
Mixing timediscrete and timecontinuous systems in a simulation
Coupling MTK with time discrete control · Issue #1180 · SciML/ModelingToolkit.jl · GitHub

The code is shortened to show the relevant parts.
@variables (action)(t) = 0.0;
function condition(u, t, integrator)
    # True at the points in time at which input data is available
    t in data.timestamp
end
function affect!(integrator)
    # Looks up the input recorded for the current time stamp
    i = findfirst(ts -> ts == integrator.t, data.timestamp)
    integrator.u[1] = data.action[i]
end
cb = DiscreteCallback(condition, affect!, save_positions = (true, true))
# for logging
eqs = vcat(
Dt(action) ~ 0,
);
# Stops at the points in time at which new inputs are present.
sol = solve(prob, Tsit5(), callback = cb, tstops = data.timestamp);
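For comparison, here is a minimal sketch of the parameter-based variant I am asking about in question 2. It is written against a plain `ODEProblem` rather than MTK, so it does not show how MTK maps symbolic parameters to indices. The double-integrator dynamics, the `timestamps` and `actions` vectors, and the name `dynamics!` are made up for illustration and are not my real model.

```julia
using OrdinaryDiffEq

# Made-up stand-in for the recorded flight data
timestamps = [0.0, 0.5, 1.0, 1.5]
actions = [1.0, 0.0, -1.0, 0.0]

# Toy double integrator: p[1] holds the currently applied action
function dynamics!(du, u, p, t)
    du[1] = u[2]   # position' = velocity
    du[2] = p[1]   # velocity' = action, held constant between updates
    return nothing
end

# Fire exactly at the recorded time stamps (they are passed as tstops below)
condition(u, t, integrator) = t in timestamps

function affect!(integrator)
    # Write the action recorded for the current time stamp into the parameter
    i = findfirst(==(integrator.t), timestamps)
    integrator.p[1] = actions[i]
end

cb = DiscreteCallback(condition, affect!, save_positions = (false, false))

# The first action is supplied as the initial parameter value,
# so the result does not depend on whether the callback fires at t0
p0 = [actions[1]]
prob = ODEProblem(dynamics!, [0.0, 0.0], (0.0, 2.0), p0)
sol = solve(prob, Tsit5(), callback = cb, tstops = timestamps)
```

With this variant the action no longer appears in `sol.u`, so it would have to be logged separately. As far as I can tell, `affect!` also mutates the parameter array that the problem holds, so a fresh parameter vector is probably needed for every solve.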
Optimization
4. I have several flight trajectories. To identify the parameters, I want to apply the recorded (discrete) actions to the (continuous) model and use the trajectory deviation at the recorded discrete time points as the loss. I want to do this for all trajectories at the same time, not just for one. The input in the affect!(integrator) function is currently taken from a global data source. Do I have to create a new ODEProblem(sys, u0, tspan, p) every time, or is there a better solution?
cost_function = build_loss_objective(prob, Tsit5(), loss(data.timestamp, data.trajectory),
    Optimization.AutoForwardDiff();
    callback = cb, tstops = data.timestamp,
    maxiters = 10000, verbose = false);
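To make question 4 more concrete, this is roughly the structure I have in mind for several trajectories, again as a sketch on a plain `ODEProblem` instead of my MTK system. The `flights` data, the single `gain` parameter, and the helper names `input_callback` and `total_loss` are invented for the example. The idea is to create the problem once, derive a per-flight copy with `remake`, and build one callback per flight so that `affect!` closes over that flight's data instead of reading a global.

```julia
using OrdinaryDiffEq

# Made-up stand-in for several recorded flights (consistent with gain = 2)
flights = [
    (timestamp = [0.0, 0.5, 1.0], action = [1.0, 0.0, -1.0], position = [0.0, 0.25, 0.75]),
    (timestamp = [0.0, 0.5, 1.0], action = [-1.0, 1.0, 0.0], position = [0.0, -0.25, -0.5]),
]

# Toy model: p = [gain, current action]; gain is the parameter to identify
function dynamics!(du, u, p, t)
    gain, action = p
    du[1] = u[2]
    du[2] = gain * action
    return nothing
end

# One callback per flight; affect! closes over that flight's data
function input_callback(flight)
    condition(u, t, integrator) = t in flight.timestamp
    function affect!(integrator)
        i = findfirst(==(integrator.t), flight.timestamp)
        integrator.p[2] = flight.action[i]
    end
    return DiscreteCallback(condition, affect!, save_positions = (false, false))
end

# Template problem, created once and reused for every flight
prob = ODEProblem(dynamics!, [0.0, 0.0], (0.0, 1.0), [1.0, 0.0])

# Sum of squared position errors over all flights
function total_loss(gain)
    loss = 0.0
    for flight in flights
        p = [gain, flight.action[1]]   # fresh parameter vector per solve
        tspan = (flight.timestamp[1], flight.timestamp[end])
        flight_prob = remake(prob; p = p, tspan = tspan)
        sol = solve(flight_prob, Tsit5(); callback = input_callback(flight),
                    tstops = flight.timestamp)
        predicted = [sol(t)[1] for t in flight.timestamp]
        loss += sum(abs2, predicted .- flight.position)
    end
    return loss
end
```

Calling `total_loss(2.0)` should give a loss close to zero for this made-up data, while other gains give a larger value. I have not checked whether this structure differentiates cleanly with ForwardDiff, or how it would carry over to `build_loss_objective`.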