Challenges in Estimating Wind Turbine Controller Using Universal Differential Equations (UDE) Approach

Hello all,

As part of my PhD work, I’m applying the Universal Differential Equations (UDE) approach seen in Automatically Discover Missing Physics by Embedding Machine Learning into Differential Equations · Overview of Julia's SciML to estimate the controller of a wind turbine. I’m using a simple aerodynamic model with three states (rotor speed wr, generator torque Tg and blade pitch β) and one exogenous input (the wind speed vr). I lack knowledge of the controllers responsible for computing the reference torque Tg_ref and reference pitch β_ref. They might function as PI controllers, but I’d rather not presume any specific structure to maintain generality, preferring to keep them as a neural network. The system has been simulated previously in open loop (using Tg and blade pitch β as exogenous inputs apart from vr) and it produces quite accurate results compared to the data, so the issue I describe below is not due to a wrong aerodynamic model.

My issue: I’m unable to get a remotely decent neural network estimation of the controller, which I find surprising, as it normally has a form similar to this: β_ref = Kp_β * wr_err + Ki_β * ∫wr_err and Tg_ref = Kp_Tg * wr_err + Tg_rated, with wr_err = wr – wr_rated. A neural network should have no problem approximating such a simple relationship. Depending on the random initialization of the neural network parameters I’ve noticed various outcomes, none of which are successful: either my loss remains unchanged throughout the iterations of Adam, or the problem becomes unstable, preventing the optimization from completing, or if it does complete, the estimated trajectory is unsatisfactory.

Note 1: I’ve scaled the data dividing by the maximum possible value of each variable. Inside the wind_turbine_ude!() function, the error and the controller are supposed to be scaled such that the neural networks only have to deal with values between 0 and 1. Then, I de-scale wr, Tg and β by multiplying by their maximum values because the rest of the model needs the variables in the original units, and then I make sure that the derivatives are scaled again in the last three lines of the function.

Note 2: Defining f_β = x -> max(min(1.0, x), 0.0) and f_Tg = x -> max(min(1.0, x), 0.0) and then using them as Tg_ref = NN_Tg([wr_err], p.p_NN_β2, _st_NN_β)[1][1] and β_ref = NN_β([wr_err], p.p_NN_Tg2, _st_NN_Tg)[1][1] seems to help in avoiding instability, but still the outcome is wrong as I described earlier.


  • How can I set up the initial parameters of the neural network to avoid the untrained network producing weird initial trajectories? Just to try to make the problem easier for the optimizer.
  • I’ve also tried defining only one neural network for both control signals, but I get similar outcomes. Nevertheless, which option would be more advisable, using one neural network or using two as in the code attached below?
  • Any idea of what else could be harming the estimation problem? Any suggestions or small improvements are greatly appreciated.

My code can be seen hereunder:

# SciML Tools
using OrdinaryDiffEq, ModelingToolkit, DataDrivenDiffEq, SciMLSensitivity, DataDrivenSparse
using Optimization, OptimizationOptimisers, OptimizationOptimJL, OptimizationPolyalgorithms

# Standard Libraries
using LinearAlgebra, Statistics

# External Libraries
using ComponentArrays, Lux, Zygote, Plots, StableRNGs

# Set a random seed for reproducible behaviour
rng = StableRNG(1111)

using XLSX

# Load data from Excel sheet
filename = "...\\data_14ms_4000.xlsx" 
data = XLSX.readxlsx(filename)["Ark1"]  # Assuming your data is in Sheet1 (Ark1)

wr_data = data["A"][1:2000]  # Assuming column A contains angular speed data (rad/s)
Tg_data = data["B"][1:2000]  # Assuming column B contains generator torque data (Nm)
#β = data["C"]  # Assuming column C contains pitch angle data (rad)
β_data = data["F"][1:2000]  # Assuming column F contains pitch angle data (degrees)
vr_data = data["D"][1:2000]  # Assuming column D contains wind speed data (m/s)

wr_data = map(Float64, wr_data)
Tg_data = map(Float64, Tg_data)
β_data = map(Float64, β_data)
vr_data = map(Float64, vr_data)

tspan = (30.0, 129.95) # end = 629.95
t_data = range(tspan[1], tspan[2], length=2000)  # Time points

# Interpolate exogenous input
using DataInterpolations
vr_interp = BSplineInterpolation(vr_data[1:end], t_data[1:end], 2, :ArcLen, :Average)

# Define UDE

rbf(x) = exp.(-(x .^ 2))

# Multilayer FeedForward
const NN_β = Lux.Chain(Lux.Dense(1, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf),
              Lux.Dense(5, 1))
# Randomly initialize the initial parameters and state variables of the model
p_NN_β, st_NN_β = Lux.setup(rng, NN_β)
p_NN_β2 = ComponentArray(p_NN_β)
const _st_NN_β = st_NN_β

# Multilayer FeedForward
const NN_Tg = Lux.Chain(Lux.Dense(1, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf),
              Lux.Dense(5, 1))
# Randomly initialize the initial parameters and state variables of the model
p_NN_Tg, st_NN_Tg = Lux.setup(rng, NN_Tg)
p_NN_Tg2 = ComponentArray(p_NN_Tg)
const _st_NN_Tg = st_NN_Tg

p_all = ComponentArray(; p_NN_β2, p_NN_Tg2)

# To avoid values outside [0,1] range
f_β = x -> max(min(1.0, x), 0.0)
f_Tg = x -> max(min(1.0, x), 0.0)

# Scaling
wr_max = 1.0472
Tg_max = 2.159e7
β_max = 90

wr_data_scaled = wr_data ./ wr_max
Tg_data_scaled = Tg_data ./ Tg_max
β_data_scaled = β_data ./ β_max

# Define model constants
μd = 0.05
Rr = 120.998
ρ = 1.225
Ar = π * Rr^2
Jr = 321699000
Jg = 3.777e6
c1 = 0.1661
c2 = 115.9810
c3 = 1.4138
c4 = 5.3663
c5 = 16.2881
c6 = 0.0410
tau_Tg = 0.05
tau_β = 0.05
wr_rated = 0.7917 / wr_max

# Define hybrid wind turbine model with wind speed as exogenous input
function wind_turbine_ude!(du, u, p, t)
    # States
    wr = u[1]
    Tg = u[2]
    β = u[3]

    # Error
    wr_err = wr - wr_rated

    # Controller
    Tg_ref = NN_Tg([wr_err], p.p_NN_β2, _st_NN_β)[1][1]
    β_ref = NN_β([wr_err], p.p_NN_Tg2, _st_NN_Tg)[1][1]
    # Calculate tip-speed ratio λ
    λ = (wr*wr_max) * Rr / vr_interp(t)
     # Calculate 1/λi (numerical approx. for Cp(θ, λ))
    λi_inv = 1 / (λ + 0.08 * (β*β_max)) - 0.035 / ((β*β_max)^3 + 1)
    # Calculate Cp (numerical approx. for Cp(θ, λ))
    Cp = max(0, c1 * (c2 * λi_inv - c3 * (β*β_max) - c4) * exp(-c5 * λi_inv) + c6 * λ)

    # Calculate aerodynamic torque
    Tr = 1 / (2 * (wr*wr_max)) * ρ * Ar * vr_interp(t)^3 * Cp # max(0, Cp)
    # State derivatives
    du[1] = ((1 - μd) * Tr - (Tg*Tg_max)) / (Jr + Jg) / wr_max
    du[2] = ((Tg_ref*Tg_max) - (Tg*Tg_max)) / tau_Tg / Tg_max
    du[3] = ((β_ref*β_max) - (β*β_max)) / tau_β / β_max


# Define initial-value problem.
wr0 = wr_data_scaled[1]
Tg0 = Tg_data_scaled[1]
vr0 = vr_data[1]
β0 = β_data_scaled[1]

# Define the problem
prob_UDE = ODEProblem(wind_turbine_ude!, [wr0, Tg0, β0], tspan, p_all)

function predict(θ, X = [wr0, Tg0, β0], T = t_data)
    _prob = remake(prob_UDE, u0 = X, tspan = (T[1], T[end]), p = θ)
    Array(solve(_prob, AutoTsit5(Rosenbrock23()), saveat = T, 
                abstol = 1e-6, reltol = 1e-6,

# Transpose data
wr_data_scaled2 = reshape(wr_data_scaled, 1, :)
Tg_data_scaled2 = reshape(Tg_data_scaled, 1, :)
β_data_scaled2 = reshape(β_data_scaled, 1, :)
all_data_scaled = vcat(wr_data_scaled2, Tg_data_scaled2, β_data_scaled2)

function loss(θ)
    X̂ = predict(θ)
    mean(abs2, all_data_scaled .- X̂)

losses = Float64[]

callback = function (p, l)
    push!(losses, l)
    if length(losses) % 10 == 0
        println("Current loss after $(length(losses)) iterations: $(losses[end])")
    return false

adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, ComponentVector{Float64}(p_all)) 

res1 = Optimization.solve(optprob, ADAM(0.1), callback = callback, maxiters = 1000)
println("Training loss after $(length(losses)) iterations: $(losses[end])")

Excel file with data for reproducibility:

You say that the controller for β_ref is a PI controller, but your neural network controller is a static function, i.e., it has no integrator. Unless you include some for of state variable in the controller, you will not be able to represent an integrator. The problematic term is ∫wr_err, a feedforward neural network cannot represent this.

Try adding

du[4] = wr_err

and add the fourth state variable to the controller input as well.

Thank you so much, that solves the issue. See below the trajectory estimation of the 1st state (wr) of the UDE in comparison to the data.

One question, how could I initialize the neural network parameters such that it outputs an certain initial value specified by me? In order to produce the result shown above, I had to run the code multiple times until, finally, I got by chance a set of initial parameters of the neural network that did not output unstable trajectories.

I’m not sure what you want to accomplish, if you want the output y = f(u) at time 0 to be y(0) = y_0, you can just see what f(u(0)) returns, and perform the transformation

y(t) = f(u(t)) - f(u(0)) + y_0

You seem to conflate stability of a dynamical system with initial condition?

1 Like

I mean something like this:

Sometimes the training will not finish if the initialization of the weights of the two neural networks makes the system unstable. I want to initialize the weights such that the untrained neural networks produce a value of 1 for the two control variables, for example. I’ve read the responses in that thread but it is still not clear to me how to solve this.