I’m having trouble understanding how to pass trainable parameters to DiffEqFlux.
To understand how DiffEqFlux works, I set myself the following exercise: replace p = [1.5, 1.0, 3.0, 1.0] in the manual’s example of optimizing an ODE with a neural network that learns the parameters as a function of an input vector (say, u0, though it could be anything).
(This is an impractical complication, of course; the goal is for it to be an instructive exercise.)
The problem I’m running into is that I can’t figure out how to pass the resulting parameters to DiffEqFlux.sciml_train.
My code is as follows:
#Copied from the manual
using DifferentialEquations, Flux, Optim, DiffEqFlux, DiffEqSensitivity, Plots

function lotka_volterra!(du, u, p, t)
    x, y = u
    α, β, δ, γ = p
    du[1] = dx = α*x - β*x*y
    du[2] = dy = -δ*y + γ*x*y
end
#My neural net (by hand for the sake of the exercise): a 2 → 5 → 4 network
W1 = rand(5, 2)
b1 = rand(5)
W2 = rand(4, 5)
b2 = rand(4)
model(x) = relu.(W2*relu.(W1*x + b1) + b2)  # maps a length-2 input to the 4 ODE parameters
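My working guess is that sciml_train wants all trainable parameters as a single flat vector, so one variant I’ve sketched (θ and model_flat are my own hypothetical names, untested) packs the weights into one vector and reshapes them inside the model:

θ = vcat(vec(W1), b1, vec(W2), b2)  # 10 + 5 + 20 + 4 = 39 entries

function model_flat(θ, x)
    W1 = reshape(θ[1:10], 5, 2)
    b1 = θ[11:15]
    W2 = reshape(θ[16:35], 4, 5)
    b2 = θ[36:39]
    return relu.(W2*relu.(W1*x + b1) + b2)
end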
#Copied from the manual
u0 = [1.0, 1.0]
tspan = (0.0, 10.0)
tsteps = 0.0:0.1:10.0

#Copied from the manual, but with p replaced by model(u0):
prob = ODEProblem(lotka_volterra!, u0, tspan, model(u0))
sol = solve(prob, Tsit5())
plot(sol) #Generates a plot using the model's (untrained) parameters.
function loss()
    sol = solve(prob, Tsit5(), p = #=Can't see what goes here=#, saveat = tsteps)
    loss = sum(abs2, sol .- 1)
    return loss, sol
end
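If the flat-vector guess above is right, I assume the loss would take θ and feed the network’s output in as the ODE parameters, something like (again untested, using my hypothetical model_flat):

function loss_flat(θ)
    p = model_flat(θ, u0)  # the network output plays the role of the ODE parameters
    sol = solve(prob, Tsit5(), p = p, saveat = tsteps)
    return sum(abs2, sol .- 1), sol
end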
callback = function (p, l, pred)
    display(l)
    plt = plot(pred, ylim = (0, 6))
    display(plt)
    return false
end
result_ode = DiffEqFlux.sciml_train(loss,
                                    #=params(W1, W2, b1, b2) returns an error=#,
                                    ADAM(0.1),
                                    cb = callback,
                                    maxiters = 100)
Anyway, I’m kinda at a loss here for how to handle the parameters. I can’t put params(W1, W2, b1, b2) into the sciml_train function, since it returns an error:
ERROR: MethodError: no method matching copy(::Zygote.Params)
Closest candidates are:
copy(::Expr) at expr.jl:36
copy(::Core.CodeInfo) at expr.jl:64
copy(::BitSet) at bitset.jl:46
...
Stacktrace:
[1] sciml_train(::Function, ::Zygote.Params, ::ADAM, ::Base.Iterators.Cycle{Tuple{DiffEqFlux.NullData}}; cb::Function, maxiters::Int64, progress::Bool, save_best::Bool) at C:\Users\MattProgramming\.juliapro\JuliaPro_v1.4.2-1\packages\DiffEqFlux\FZMwP\src\train.jl:77
[2] top-level scope at none:0
And if I put in the parameters explicitly (e.g., using W1 for the parameter), nothing trains.
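My best guess at the call, continuing the flat-vector sketch from above (loss_flat and θ are my hypothetical names, untested), would be:

result_ode = DiffEqFlux.sciml_train(loss_flat, θ, ADAM(0.1),
                                    cb = callback,
                                    maxiters = 100)

but I don’t know whether that’s the intended way to tell sciml_train which parameters to optimize.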
(I think I may be able to solve this as a neural differential equation, by chaining lotka_volterra into the neural network, but this problem doesn’t seem, in principle, any more complicated than the manual example; I just can’t figure out how to tell sciml_train which parameters to use.)
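The other route I’m considering, if I understand Flux’s API correctly, is to let Flux.destructure do the flattening for me:

nn = Chain(Dense(2, 5, relu), Dense(5, 4, relu))  # same 2 → 5 → 4 architecture as above
θnn, re = Flux.destructure(nn)  # θnn: flat parameter vector; re: rebuilds the network from it
# re(θnn)(u0) should then produce the four ODE parameters

but again, I’m not sure how that flat vector is meant to flow into sciml_train.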