Hello all,

As part of my PhD work, I’m applying the Universal Differential Equations (UDE) approach seen in Automatically Discover Missing Physics by Embedding Machine Learning into Differential Equations · Overview of Julia's SciML to estimate the controller of a wind turbine. I’m using a simple aerodynamic model with three states (rotor speed `wr`

, generator torque `Tg`

and blade pitch `β`

) and one exogenous input (the wind speed `vr`

). I lack knowledge of the controllers responsible for computing the reference torque `Tg_ref`

and reference pitch `β_ref`

. They might function as PI controllers, but I’d rather not presume any specific structure to maintain generality, preferring to keep them as a neural network. The system has been simulated previously in open loop (using `Tg`

and blade pitch `β`

as exogenous inputs apart from `vr`

) and it produces quite accurate results compared to the data, so the issue I describe below is not due to a wrong aerodynamic model.

**My issue:** I’m unable to get a remotely decent neural network estimation of the controller, which I find surprising, as it normally has a form similar to this: ` β_ref = Kp_β * wr_err + Ki_β * ∫wr_err`

and `Tg_ref = Kp_Tg * wr_err + Tg_rated`

, with `wr_err = wr – wr_rated`

. A neural network should have no problem approximating such a simple relationship. Depending on the random initialization of the neural network parameters I’ve noticed various outcomes, none of which are successful: either my loss remains unchanged throughout the iterations of Adam, or the problem becomes unstable, preventing the optimization from completing, or if it does complete, the estimated trajectory is unsatisfactory.

**Note 1:** I’ve scaled the data dividing by the maximum possible value of each variable. Inside the `wind_turbine_ude!()`

function, the error and the controller are supposed to be scaled such that the neural networks only have to deal with values between 0 and 1. Then, I de-scale `wr`

, `Tg`

and `β`

by multiplying by their maximum values because the rest of the model needs the variables in the original units, and then I make sure that the derivatives are scaled again in the last three lines of the function.

**Note 2:** Defining `f_β = x -> max(min(1.0, x), 0.0)`

and `f_Tg = x -> max(min(1.0, x), 0.0)`

and then using them as `Tg_ref = NN_Tg([wr_err], p.p_NN_β2, _st_NN_β)[1][1]`

and `β_ref = NN_β([wr_err], p.p_NN_Tg2, _st_NN_Tg)[1][1]`

seems to help in avoiding instability, but still the outcome is wrong as I described earlier.

**Questions:**

- How can I set up the initial parameters of the neural network to avoid the untrained network producing weird initial trajectories? Just to try to make the problem easier for the optimizer.
- I’ve also tried defining only one neural network for both control signals, but I get similar outcomes. Nevertheless, which option would be more advisable, using one neural network or using two as in the code attached below?
- Any idea of what else could be harming the estimation problem? Any suggestions or small improvements are greatly appreciated.

My code can be seen hereunder:

```
# SciML Tools
using OrdinaryDiffEq, ModelingToolkit, DataDrivenDiffEq, SciMLSensitivity, DataDrivenSparse
using Optimization, OptimizationOptimisers, OptimizationOptimJL, OptimizationPolyalgorithms
# Standard Libraries
using LinearAlgebra, Statistics
# External Libraries
using ComponentArrays, Lux, Zygote, Plots, StableRNGs
gr()
# Set a random seed for reproducible behaviour
rng = StableRNG(1111)
using XLSX
# Load data from Excel sheet
filename = "...\\data_14ms_4000.xlsx"
data = XLSX.readxlsx(filename)["Ark1"] # Assuming your data is in Sheet1 (Ark1)
wr_data = data["A"][1:2000] # Assuming column A contains angular speed data (rad/s)
Tg_data = data["B"][1:2000] # Assuming column B contains generator torque data (Nm)
#β = data["C"] # Assuming column C contains pitch angle data (rad)
β_data = data["F"][1:2000] # Assuming column F contains pitch angle data (degrees)
vr_data = data["D"][1:2000] # Assuming column D contains wind speed data (m/s)
wr_data = map(Float64, wr_data)
Tg_data = map(Float64, Tg_data)
β_data = map(Float64, β_data)
vr_data = map(Float64, vr_data)
tspan = (30.0, 129.95) # end = 629.95
t_data = range(tspan[1], tspan[2], length=2000) # Time points
# Interpolate exogenous input
using DataInterpolations
vr_interp = BSplineInterpolation(vr_data[1:end], t_data[1:end], 2, :ArcLen, :Average)
# Define UDE
rbf(x) = exp.(-(x .^ 2))
# Multilayer FeedForward
const NN_β = Lux.Chain(Lux.Dense(1, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf),
Lux.Dense(5, 1))
# Randomly initialize the initial parameters and state variables of the model
p_NN_β, st_NN_β = Lux.setup(rng, NN_β)
p_NN_β2 = ComponentArray(p_NN_β)
const _st_NN_β = st_NN_β
# Multilayer FeedForward
const NN_Tg = Lux.Chain(Lux.Dense(1, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf),
Lux.Dense(5, 1))
# Randomly initialize the initial parameters and state variables of the model
p_NN_Tg, st_NN_Tg = Lux.setup(rng, NN_Tg)
p_NN_Tg2 = ComponentArray(p_NN_Tg)
const _st_NN_Tg = st_NN_Tg
p_all = ComponentArray(; p_NN_β2, p_NN_Tg2)
# To avoid values outside [0,1] range
f_β = x -> max(min(1.0, x), 0.0)
f_Tg = x -> max(min(1.0, x), 0.0)
# Scaling
wr_max = 1.0472
Tg_max = 2.159e7
β_max = 90
wr_data_scaled = wr_data ./ wr_max
Tg_data_scaled = Tg_data ./ Tg_max
β_data_scaled = β_data ./ β_max
# Define model constants
μd = 0.05
Rr = 120.998
ρ = 1.225
Ar = π * Rr^2
Jr = 321699000
Jg = 3.777e6
c1 = 0.1661
c2 = 115.9810
c3 = 1.4138
c4 = 5.3663
c5 = 16.2881
c6 = 0.0410
tau_Tg = 0.05
tau_β = 0.05
wr_rated = 0.7917 / wr_max
# Define hybrid wind turbine model with wind speed as exogenous input
function wind_turbine_ude!(du, u, p, t)
# States
wr = u[1]
Tg = u[2]
β = u[3]
# Error
wr_err = wr - wr_rated
# Controller
Tg_ref = NN_Tg([wr_err], p.p_NN_β2, _st_NN_β)[1][1]
β_ref = NN_β([wr_err], p.p_NN_Tg2, _st_NN_Tg)[1][1]
# Calculate tip-speed ratio λ
λ = (wr*wr_max) * Rr / vr_interp(t)
# Calculate 1/λi (numerical approx. for Cp(θ, λ))
λi_inv = 1 / (λ + 0.08 * (β*β_max)) - 0.035 / ((β*β_max)^3 + 1)
# Calculate Cp (numerical approx. for Cp(θ, λ))
Cp = max(0, c1 * (c2 * λi_inv - c3 * (β*β_max) - c4) * exp(-c5 * λi_inv) + c6 * λ)
# Calculate aerodynamic torque
Tr = 1 / (2 * (wr*wr_max)) * ρ * Ar * vr_interp(t)^3 * Cp # max(0, Cp)
# State derivatives
du[1] = ((1 - μd) * Tr - (Tg*Tg_max)) / (Jr + Jg) / wr_max
du[2] = ((Tg_ref*Tg_max) - (Tg*Tg_max)) / tau_Tg / Tg_max
du[3] = ((β_ref*β_max) - (β*β_max)) / tau_β / β_max
end
# Define initial-value problem.
wr0 = wr_data_scaled[1]
Tg0 = Tg_data_scaled[1]
vr0 = vr_data[1]
β0 = β_data_scaled[1]
# Define the problem
prob_UDE = ODEProblem(wind_turbine_ude!, [wr0, Tg0, β0], tspan, p_all)
function predict(θ, X = [wr0, Tg0, β0], T = t_data)
_prob = remake(prob_UDE, u0 = X, tspan = (T[1], T[end]), p = θ)
Array(solve(_prob, AutoTsit5(Rosenbrock23()), saveat = T,
abstol = 1e-6, reltol = 1e-6,
sensealg=ForwardDiffSensitivity()))
end
# Transpose data
wr_data_scaled2 = reshape(wr_data_scaled, 1, :)
Tg_data_scaled2 = reshape(Tg_data_scaled, 1, :)
β_data_scaled2 = reshape(β_data_scaled, 1, :)
all_data_scaled = vcat(wr_data_scaled2, Tg_data_scaled2, β_data_scaled2)
function loss(θ)
X̂ = predict(θ)
mean(abs2, all_data_scaled .- X̂)
end
losses = Float64[]
callback = function (p, l)
push!(losses, l)
if length(losses) % 10 == 0
println("Current loss after $(length(losses)) iterations: $(losses[end])")
end
return false
end
adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, ComponentVector{Float64}(p_all))
res1 = Optimization.solve(optprob, ADAM(0.1), callback = callback, maxiters = 1000)
println("Training loss after $(length(losses)) iterations: $(losses[end])")
```