Hello all,
As part of my PhD work, I’m applying the Universal Differential Equations (UDE) approach seen in Automatically Discover Missing Physics by Embedding Machine Learning into Differential Equations · Overview of Julia's SciML to estimate the controller of a wind turbine. I’m using a simple aerodynamic model with three states (rotor speed wr
, generator torque Tg
and blade pitch β
) and one exogenous input (the wind speed vr
). I lack knowledge of the controllers responsible for computing the reference torque Tg_ref
and reference pitch β_ref
. They might function as PI controllers, but I’d rather not presume any specific structure to maintain generality, preferring to keep them as a neural network. The system has been simulated previously in open loop (using Tg
and blade pitch β
as exogenous inputs apart from vr
) and it produces quite accurate results compared to the data, so the issue I describe below is not due to a wrong aerodynamic model.
My issue: I’m unable to get a remotely decent neural network estimation of the controller, which I find surprising, as it normally has a form similar to this: β_ref = Kp_β * wr_err + Ki_β * ∫wr_err
and Tg_ref = Kp_Tg * wr_err + Tg_rated
, with wr_err = wr – wr_rated
. A neural network should have no problem approximating such a simple relationship. Depending on the random initialization of the neural network parameters I’ve noticed various outcomes, none of which are successful: either my loss remains unchanged throughout the iterations of Adam, or the problem becomes unstable, preventing the optimization from completing, or if it does complete, the estimated trajectory is unsatisfactory.
Note 1: I’ve scaled the data dividing by the maximum possible value of each variable. Inside the wind_turbine_ude!()
function, the error and the controller are supposed to be scaled such that the neural networks only have to deal with values between 0 and 1. Then, I de-scale wr
, Tg
and β
by multiplying by their maximum values because the rest of the model needs the variables in the original units, and then I make sure that the derivatives are scaled again in the last three lines of the function.
Note 2: Defining f_β = x -> max(min(1.0, x), 0.0)
and f_Tg = x -> max(min(1.0, x), 0.0)
and then using them as Tg_ref = NN_Tg([wr_err], p.p_NN_β2, _st_NN_β)[1][1]
and β_ref = NN_β([wr_err], p.p_NN_Tg2, _st_NN_Tg)[1][1]
seems to help in avoiding instability, but still the outcome is wrong as I described earlier.
Questions:
- How can I set up the initial parameters of the neural network to avoid the untrained network producing weird initial trajectories? Just to try to make the problem easier for the optimizer.
- I’ve also tried defining only one neural network for both control signals, but I get similar outcomes. Nevertheless, which option would be more advisable, using one neural network or using two as in the code attached below?
- Any idea of what else could be harming the estimation problem? Any suggestions or small improvements are greatly appreciated.
My code can be seen hereunder:
# SciML Tools
using OrdinaryDiffEq, ModelingToolkit, DataDrivenDiffEq, SciMLSensitivity, DataDrivenSparse
using Optimization, OptimizationOptimisers, OptimizationOptimJL, OptimizationPolyalgorithms
# Standard Libraries
using LinearAlgebra, Statistics
# External Libraries
using ComponentArrays, Lux, Zygote, Plots, StableRNGs
gr()
# Set a random seed for reproducible behaviour
rng = StableRNG(1111)
using XLSX
# Load data from Excel sheet
filename = "...\\data_14ms_4000.xlsx"
data = XLSX.readxlsx(filename)["Ark1"] # Assuming your data is in Sheet1 (Ark1)
wr_data = data["A"][1:2000] # Assuming column A contains angular speed data (rad/s)
Tg_data = data["B"][1:2000] # Assuming column B contains generator torque data (Nm)
#β = data["C"] # Assuming column C contains pitch angle data (rad)
β_data = data["F"][1:2000] # Assuming column F contains pitch angle data (degrees)
vr_data = data["D"][1:2000] # Assuming column D contains wind speed data (m/s)
wr_data = map(Float64, wr_data)
Tg_data = map(Float64, Tg_data)
β_data = map(Float64, β_data)
vr_data = map(Float64, vr_data)
tspan = (30.0, 129.95) # end = 629.95
t_data = range(tspan[1], tspan[2], length=2000) # Time points
# Interpolate exogenous input
using DataInterpolations
vr_interp = BSplineInterpolation(vr_data[1:end], t_data[1:end], 2, :ArcLen, :Average)
# Define UDE
rbf(x) = exp.(-(x .^ 2))
# Multilayer FeedForward
const NN_β = Lux.Chain(Lux.Dense(1, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf),
Lux.Dense(5, 1))
# Randomly initialize the initial parameters and state variables of the model
p_NN_β, st_NN_β = Lux.setup(rng, NN_β)
p_NN_β2 = ComponentArray(p_NN_β)
const _st_NN_β = st_NN_β
# Multilayer FeedForward
const NN_Tg = Lux.Chain(Lux.Dense(1, 5, rbf), Lux.Dense(5, 5, rbf), Lux.Dense(5, 5, rbf),
Lux.Dense(5, 1))
# Randomly initialize the initial parameters and state variables of the model
p_NN_Tg, st_NN_Tg = Lux.setup(rng, NN_Tg)
p_NN_Tg2 = ComponentArray(p_NN_Tg)
const _st_NN_Tg = st_NN_Tg
p_all = ComponentArray(; p_NN_β2, p_NN_Tg2)
# To avoid values outside [0,1] range
f_β = x -> max(min(1.0, x), 0.0)
f_Tg = x -> max(min(1.0, x), 0.0)
# Scaling
wr_max = 1.0472
Tg_max = 2.159e7
β_max = 90
wr_data_scaled = wr_data ./ wr_max
Tg_data_scaled = Tg_data ./ Tg_max
β_data_scaled = β_data ./ β_max
# Define model constants
μd = 0.05
Rr = 120.998
ρ = 1.225
Ar = π * Rr^2
Jr = 321699000
Jg = 3.777e6
c1 = 0.1661
c2 = 115.9810
c3 = 1.4138
c4 = 5.3663
c5 = 16.2881
c6 = 0.0410
tau_Tg = 0.05
tau_β = 0.05
wr_rated = 0.7917 / wr_max
# Define hybrid wind turbine model with wind speed as exogenous input
function wind_turbine_ude!(du, u, p, t)
# States
wr = u[1]
Tg = u[2]
β = u[3]
# Error
wr_err = wr - wr_rated
# Controller
Tg_ref = NN_Tg([wr_err], p.p_NN_β2, _st_NN_β)[1][1]
β_ref = NN_β([wr_err], p.p_NN_Tg2, _st_NN_Tg)[1][1]
# Calculate tip-speed ratio λ
λ = (wr*wr_max) * Rr / vr_interp(t)
# Calculate 1/λi (numerical approx. for Cp(θ, λ))
λi_inv = 1 / (λ + 0.08 * (β*β_max)) - 0.035 / ((β*β_max)^3 + 1)
# Calculate Cp (numerical approx. for Cp(θ, λ))
Cp = max(0, c1 * (c2 * λi_inv - c3 * (β*β_max) - c4) * exp(-c5 * λi_inv) + c6 * λ)
# Calculate aerodynamic torque
Tr = 1 / (2 * (wr*wr_max)) * ρ * Ar * vr_interp(t)^3 * Cp # max(0, Cp)
# State derivatives
du[1] = ((1 - μd) * Tr - (Tg*Tg_max)) / (Jr + Jg) / wr_max
du[2] = ((Tg_ref*Tg_max) - (Tg*Tg_max)) / tau_Tg / Tg_max
du[3] = ((β_ref*β_max) - (β*β_max)) / tau_β / β_max
end
# Define initial-value problem.
wr0 = wr_data_scaled[1]
Tg0 = Tg_data_scaled[1]
vr0 = vr_data[1]
β0 = β_data_scaled[1]
# Define the problem
prob_UDE = ODEProblem(wind_turbine_ude!, [wr0, Tg0, β0], tspan, p_all)
function predict(θ, X = [wr0, Tg0, β0], T = t_data)
_prob = remake(prob_UDE, u0 = X, tspan = (T[1], T[end]), p = θ)
Array(solve(_prob, AutoTsit5(Rosenbrock23()), saveat = T,
abstol = 1e-6, reltol = 1e-6,
sensealg=ForwardDiffSensitivity()))
end
# Transpose data
wr_data_scaled2 = reshape(wr_data_scaled, 1, :)
Tg_data_scaled2 = reshape(Tg_data_scaled, 1, :)
β_data_scaled2 = reshape(β_data_scaled, 1, :)
all_data_scaled = vcat(wr_data_scaled2, Tg_data_scaled2, β_data_scaled2)
function loss(θ)
X̂ = predict(θ)
mean(abs2, all_data_scaled .- X̂)
end
losses = Float64[]
callback = function (p, l)
push!(losses, l)
if length(losses) % 10 == 0
println("Current loss after $(length(losses)) iterations: $(losses[end])")
end
return false
end
adtype = Optimization.AutoZygote()
optf = Optimization.OptimizationFunction((x, p) -> loss(x), adtype)
optprob = Optimization.OptimizationProblem(optf, ComponentVector{Float64}(p_all))
res1 = Optimization.solve(optprob, ADAM(0.1), callback = callback, maxiters = 1000)
println("Training loss after $(length(losses)) iterations: $(losses[end])")