Tracking the initial condition to optimize the starting value too (DiffEqFlux)

Hello,
I am curious whether I can use the existing tooling to optimize the initial conditions too.
So far I only get this error:
**MethodError: no method matching OrdinaryDiffEq.Tsit5Cache(::TrackedArray{…,Array{Float64,1}}, ::TrackedArray{…,Array{Float64,1}}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Tracker.TrackedReal{Float64},1}, ::OrdinaryDiffEq.Tsit5ConstantCache{Float64,Float64})**

Minimal working example:

using Flux, DiffEqFlux, DifferentialEquations, Plots, LinearAlgebra


GDP = [11394358246872.6, 11886411296037.9, 12547852149499.6, 13201781525927, 14081902622923.3, 14866223429278.3, 15728198883149.2, 16421593575529.9, 17437921118338, 18504710349537.1, 19191754995907.1, 20025063402734.2, 21171619915190.4, 22549236163304.4, 22999815176366.2, 23138798276196.2, 24359046058098.6, 25317009721600.9, 26301669369287.8, 27386035164588.8, 27907493159394.4, 28445139283067.1, 28565588996657.6, 29255060755937.6, 30574152048605.8, 31710451102539.4, 32786657119472.8, 34001004119223.5, 35570841010027.7, 36878317437617.5, 37952345258555.4, 38490918890678.7, 39171116855465.5, 39772082901255.8, 40969517920094.4, 42210614326789.4, 43638265675924.6, 45254805649669.6, 46411399944618.2, 47929948653387.3, 50036361141742.2, 51009550274808.6, 52127765360545.5, 53644090247696.9, 55995239099025.6, 58161311618934.2, 60681422072544.7, 63240595965946.1, 64413060738139.7, 63326658023605.9, 66036918504601.7, 68100669928597.9, 69811348331640.1, 71662400667935.7, 73698404958519.1, 75802901433146, 77752106717302.4, 80209237761564.8, 82643194654568.3]
function monomial(dcGDP, cGDP, parameters, t)
    α1, β1, nu1, nu2, δ, δ2 = parameters

    dcGDP[1] = α1 * ((cGDP[1]))^β1
end


GDP0 = GDP[1]

tspan = (1.0, 59.0)
p = param([474.8501513113645, 0.7036417845990167, 0.0, 1e-10, 1e-10, 1e-10])


if false
    prob = ODEProblem(monomial,[GDP0],tspan,p)
else ## this branch (tracked initial conditions) is the one that runs and crashes
    prob = ODEProblem(monomial,param([GDP0]),tspan,p)
end
function predict_rd() # Our 1-layer neural network
  diffeq_rd(p,prob,Tsit5(),saveat=1.0)
end

function loss_rd() ##L2 norm biases the newer times unfairly
    ##Graph looks better if we minimize relative error squared
    c = 0.0
    a = predict_rd()
    d = 0.0
    for i=1:59
        c += (a(i)[1]/GDP[i]-1)^2 ## L2 of relative error
    end
    c + 3 * d
end

data = Iterators.repeated((), 1000)
opt = ADAM(0.00000005)

peek = function () # callback function to observe training
    # reduces training speed by a lot
    plot(1:59,GDP,legend=false)
    display(plot!(solve(remake(prob,p=Flux.data(p)),Tsit5(),saveat=1.0),linewidth=0.8))
    plot!(1:59,GDP)
    println("Loss: ",loss_rd())
end

peek()
for i=1:100
  Flux.train!(loss_rd, [p], data, opt)
  peek()
end

Am I doing it wrong?
How else can I leave the initial conditions open for optimization?

Thank you.
(The code is based on similar code I posted recently, but I thought I had better address my problems one by one.)

If you want to use reverse-mode autodiff here, I would suggest using it like this:

using Flux, DiffEqFlux, DifferentialEquations, Plots, LinearAlgebra


GDP = [11394358246872.6, 11886411296037.9, 12547852149499.6, 13201781525927, 14081902622923.3, 14866223429278.3, 15728198883149.2, 16421593575529.9, 17437921118338, 18504710349537.1, 19191754995907.1, 20025063402734.2, 21171619915190.4, 22549236163304.4, 22999815176366.2, 23138798276196.2, 24359046058098.6, 25317009721600.9, 26301669369287.8, 27386035164588.8, 27907493159394.4, 28445139283067.1, 28565588996657.6, 29255060755937.6, 30574152048605.8, 31710451102539.4, 32786657119472.8, 34001004119223.5, 35570841010027.7, 36878317437617.5, 37952345258555.4, 38490918890678.7, 39171116855465.5, 39772082901255.8, 40969517920094.4, 42210614326789.4, 43638265675924.6, 45254805649669.6, 46411399944618.2, 47929948653387.3, 50036361141742.2, 51009550274808.6, 52127765360545.5, 53644090247696.9, 55995239099025.6, 58161311618934.2, 60681422072544.7, 63240595965946.1, 64413060738139.7, 63326658023605.9, 66036918504601.7, 68100669928597.9, 69811348331640.1, 71662400667935.7, 73698404958519.1, 75802901433146, 77752106717302.4, 80209237761564.8, 82643194654568.3]
function monomial(cGDP, parameters, t)
    α1, β1, nu1, nu2, δ, δ2 = parameters

    [α1 * ((cGDP[1]))^β1]
end
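# The TrackedArray method below is the key addition: with a tracked initial
# condition the state is a TrackedArray, and Tracker.collect keeps the returned
# array tracked so reverse-mode AD can differentiate through u0.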
function monomial(cGDP::TrackedArray, parameters, t)
    α1, β1, nu1, nu2, δ, δ2 = parameters

    Tracker.collect([α1 * ((cGDP[1]))^β1])
end



GDP0 = GDP[1]

tspan = (1.0, 59.0)
p = param([474.8501513113645, 0.7036417845990167, 0.0, 1e-10, 1e-10, 1e-10])
u0 = param([GDP0])
if false
    prob = ODEProblem(monomial,[GDP0],tspan,p)
else ## tracked initial conditions (now handled by the TrackedArray method above)
    prob = ODEProblem(monomial,u0,tspan,p)
end
function predict_rd() # Our 1-layer neural network
  diffeq_rd(p,prob,Tsit5(),saveat=1.0:1.0:59.0)
end

function loss_rd() ##L2 norm biases the newer times unfairly
    ##Graph looks better if we minimize relative error squared
    c = 0.0
    a = predict_rd()
    d = 0.0
    for i=1:59
        c += (a[i][1]/GDP[i]-1)^2 ## L2 of relative error
    end
    c + 3 * d
end

data = Iterators.repeated((), 100)
opt = ADAM(0.01)

peek = function () #callback function to observe training
    #reduces training speed by a lot
    plot(1:59,GDP,legend=false)
    display(plot!(solve(remake(prob,p=Flux.data(p),u0=Flux.data(u0)),Tsit5(),saveat=1.0),linewidth=0.8))
    plot!(1:59,GDP)
    println("Loss: ",loss_rd())
end

peek()
Flux.train!(loss_rd, Flux.Params([p,u0]), data, opt, cb=peek)
peek()

However, I might suggest using adjoints instead if you’re going to put a neural network in there. That would look like:

using Flux, DiffEqFlux, DifferentialEquations, Plots, LinearAlgebra


GDP = [11394358246872.6, 11886411296037.9, 12547852149499.6, 13201781525927, 14081902622923.3, 14866223429278.3, 15728198883149.2, 16421593575529.9, 17437921118338, 18504710349537.1, 19191754995907.1, 20025063402734.2, 21171619915190.4, 22549236163304.4, 22999815176366.2, 23138798276196.2, 24359046058098.6, 25317009721600.9, 26301669369287.8, 27386035164588.8, 27907493159394.4, 28445139283067.1, 28565588996657.6, 29255060755937.6, 30574152048605.8, 31710451102539.4, 32786657119472.8, 34001004119223.5, 35570841010027.7, 36878317437617.5, 37952345258555.4, 38490918890678.7, 39171116855465.5, 39772082901255.8, 40969517920094.4, 42210614326789.4, 43638265675924.6, 45254805649669.6, 46411399944618.2, 47929948653387.3, 50036361141742.2, 51009550274808.6, 52127765360545.5, 53644090247696.9, 55995239099025.6, 58161311618934.2, 60681422072544.7, 63240595965946.1, 64413060738139.7, 63326658023605.9, 66036918504601.7, 68100669928597.9, 69811348331640.1, 71662400667935.7, 73698404958519.1, 75802901433146, 77752106717302.4, 80209237761564.8, 82643194654568.3]
function monomial(dcGDP, cGDP, parameters, t)
    α1, β1, nu1, nu2, δ, δ2 = parameters

    dcGDP[1] = α1 * ((cGDP[1]))^β1
end


GDP0 = GDP[1]

tspan = (1.0, 59.0)
p = param([474.8501513113645, 0.7036417845990167, 0.0, 1e-10, 1e-10, 1e-10])
u0 = param([GDP0])
if false
    prob = ODEProblem(monomial,[GDP0],tspan,p)
else ## tracked initial conditions
    prob = ODEProblem(monomial,u0,tspan,p)
end
function predict_adjoint() # Our 1-layer neural network
  diffeq_adjoint(p,prob,Tsit5(),saveat=1.0)
end

function loss_adjoint() ##L2 norm biases the newer times unfairly
    ##Graph looks better if we minimize relative error squared
    c = 0.0
    a = predict_adjoint()
    d = 0.0
    for i=1:59
        c += (a[i][1]/GDP[i]-1)^2 ## L2 of relative error
    end
    c + 3 * d
end

data = Iterators.repeated((), 100)
opt = ADAM(0.01)

peek = function () #callback function to observe training
    #reduces training speed by a lot
    plot(1:59,GDP,legend=false)
    display(plot!(solve(remake(prob,p=Flux.data(p),u0=Flux.data(u0)),Tsit5(),saveat=1.0),linewidth=0.8))
    plot!(1:59,GDP)
    println("Loss: ",loss_adjoint())
end

peek()
Flux.train!(loss_adjoint, Flux.Params([p,u0]), data, opt, cb=peek)
peek()

Now that the initial conditions are updated, I have a few more questions:
Would it make sense to implement multiple shooting using this?
(Optimize the differential equation over different time intervals, and also optimize how “continuous” the function family is at the contact points; see the sketch below.)
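
For what it’s worth, a rough, hypothetical sketch of that idea, reusing the out-of-place monomial and the diffeq_rd pattern from the reverse-mode example above (the segment boundaries, the penalty form, and the name loss_ms are all made up for illustration):

# each shooting segment gets its own tracked initial condition
segs = [(1.0, 20.0), (20.0, 40.0), (40.0, 59.0)]
u0s  = [param([GDP[1]]), param([GDP[20]]), param([GDP[40]])]
probs = [ODEProblem(monomial, u0s[k], segs[k], p) for k in 1:3]

function loss_ms()
    c = 0.0
    for k in 1:3
        t0, t1 = segs[k]
        a = diffeq_rd(p, probs[k], Tsit5(), saveat=t0:1.0:t1)
        for (j, t) in enumerate(t0:1.0:t1)
            c += (a[j][1]/GDP[Int(t)] - 1)^2    # data misfit on this segment
        end
        # continuity penalty: the end of segment k should match the start of k+1
        k < 3 && (c += (a[end][1]/u0s[k+1][1] - 1)^2)
    end
    c
end

Flux.train!(loss_ms, Flux.Params([p, u0s...]), data, opt)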

The differential equation has a finite escape time if the exponent is > 1.
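(For reference: separating variables in u' = α·u^β with β > 1 gives u(t) = (u0^(1−β) − α(β−1)t)^(1/(1−β)), which blows up at the finite time t* = u0^(1−β) / (α(β−1)).)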
I also plan to train a more unstable version:

function simplemonolinearkoef(dcGDP, cGDP, parameters, t)
    α1, β1, nu, nu1, δ, δ2 = parameters
    dcGDP[1] = α1 * ((cGDP[1]))^(β1*(1 + δ*(t) + δ2*(t^2)))
end

Any recommendations for how I can handle a failure of the integrator due to the explosion?
An infinite error (which is what happens mathematically) is not a useful loss to backpropagate.
Can you recommend something to read about adjoints vs. reverse/forward-mode autodiff?

Check sol.retcode != :Success and give it an infinite loss? I am not sure how Flux will do there, but it’s worth a try.
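
In code, that check might look like the following (a minimal sketch reusing predict_rd and GDP from above; it assumes the object returned by predict_rd exposes retcode like an ODESolution):

function loss_rd_checked()
    a = predict_rd()
    a.retcode != :Success && return Inf  # blown-up integration ⇒ infinite loss
    sum(i -> (a[i][1]/GDP[i] - 1)^2, 1:59)
end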

Just did a major docs update: https://github.com/SciML/DiffEqFlux.jl

As expected, it cannot handle an infinite loss and fails with an obvious error:

Loss is infinite
in top-level scope at base/none
in  at base/none
in #train!#12 at Flux/qXNjB/src/optimise/train.jl:69
in macro expansion at Juno/TfNYn/src/progress.jl:124 
in macro expansion at Flux/qXNjB/src/optimise/train.jl:71 
in gradient at Tracker/RRYy6/src/back.jl:164 
in #gradient#24 at Tracker/RRYy6/src/back.jl:164
in gradient_ at Tracker/RRYy6/src/back.jl:98
in losscheck at Tracker/RRYy6/src/back.jl:154

Does anybody have an idea how to solve the problem that an “incomplete integration” leaves the L² norm unable to judge how bad the fit is?
The problem is that there is no “degree” of failure: either the solution explodes or it does not.
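
One hedged idea along those lines (my sketch, not something proposed in this thread): grade a failed run by how much of the time span it survived, so that failed candidates at least become comparable to each other:

function loss_graded() # non-differentiable sketch; more suited to derivative-free optimizers
    sol = solve(remake(prob, p=Flux.data(p), u0=Flux.data(u0)), Tsit5(), saveat=1.0)
    # the earlier the blow-up, the larger the penalty; the big constant keeps
    # any failure worse than any complete-but-poor fit
    sol.retcode != :Success && return 1e6 * (tspan[2] - sol.t[end])
    sum(i -> (sol[i][1]/GDP[i] - 1)^2, 1:59)
end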

I am sorry that I had to remove the answer tick, but a wrong initial condition is not changed.
I ran it with

GDP0 = 1.0e12

and u0 was unchanged after training. This holds for the reverse-mode autodiff version with the code unchanged except for the one line seen above.
Your adjoint version fails, independent of my change, at

Flux.train!(loss_adjoint, Flux.Params([p,u0]), data, opt, cb=peek)

BoundsError: attempt to access 0-element StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}} at index [0]

Stacktrace:
 [1] throw_boundserror(::StepRangeLen{Float64,Base.TwicePrecision{Float64},Base.TwicePrecision{Float64}}, ::Tuple{Int64}) at ./abstractarray.jl:484
 [2] checkbounds at ./abstractarray.jl:449 [inlined]
 [3] getindex at ./range.jl:619 [inlined]
 [4] tstop_saveat_disc_handling at /home/joto/.julia/packages/OrdinaryDiffEq/tat1c/src/solve.jl:407 [inlined]
 [5] #__init#297(::Float64, ::Array{Float64,1}, …, ::Type{Val{true}}) at /home/joto/.julia/packages/OrdinaryDiffEq/tat1c/src/solve.jl:154
 [6] (::getfield(DiffEqBase, Symbol("#kw##__init")))(…) at ./none:0
 [7] #__solve#296(…) at /home/joto/.julia/packages/OrdinaryDiffEq/tat1c/src/solve.jl:6
 [8] #__solve at ./none:0 [inlined] (repeats 5 times)
 [9] #solve#364(…) at /home/joto/.julia/packages/DiffEqBase/dYU4n/src/solve.jl:39
 [10] (::getfield(DiffEqBase, Symbol("#kw##solve")))(…) at ./none:0
 [11] #adjoint_sensitivities_u0#24(…) at /home/joto/.julia/packages/DiffEqSensitivity/1T4VA/src/adjoint_sensitivity.jl:331
 [12] (::getfield(DiffEqFlux, Symbol("##24#26")){…})(::Array{Float64,2}) at ./none:0
 [13] back_(::Tracker.Call{…}, ::Array{Float64,2}, ::Bool) at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:35
 [14] back(::Tracker.Tracked{Array{Float64,2}}, ::Array{Float64,2}, ::Bool) at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:58
 [15] #13 at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:38 [inlined]
 [16] foreach at ./abstractarray.jl:1836 [inlined]
 [17] back_(::Tracker.Call{getfield(Tracker, Symbol("##372#374")){Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},TrackedArray{Float64,2,Array{Float64,2}},Tuple{Int64}},Tuple{Tracker.Tracked{Array{Float64,2}},Nothing}}, ::Float64, ::Bool) at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:38
 ... (the last 4 lines are repeated 5 more times)
 [38] back(::Tracker.Tracked{Float64}, ::Int64, ::Bool) at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:58
 [39] #back!#15 at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:77 [inlined]
 [40] #back! at ./none:0 [inlined]
 [41] #back!#32 at /home/joto/.julia/packages/Tracker/RRYy6/src/lib/real.jl:16 [inlined]
 [42] back!(::Tracker.TrackedReal{Float64}) at /home/joto/.julia/packages/Tracker/RRYy6/src/lib/real.jl:14
 [43] gradient_(::getfield(Flux.Optimise, Symbol("##14#20")){typeof(loss_adjoint)}, ::Tracker.Params) at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:4
 [44] #gradient#24(::Bool, ::Function, ::Function, ::Tracker.Params) at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:164
 [45] gradient at /home/joto/.julia/packages/Tracker/RRYy6/src/back.jl:164 [inlined]
 [46] macro expansion at /home/joto/.julia/packages/Flux/qXNjB/src/optimise/train.jl:71 [inlined]
 [47] macro expansion at /home/joto/.julia/packages/Juno/TfNYn/src/progress.jl:124 [inlined]
 [48] #train!#12(::getfield(Main, Symbol("##9#10")), ::Function, ::Function, ::Tracker.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{}}}, ::ADAM) at /home/joto/.julia/packages/Flux/qXNjB/src/optimise/train.jl:69
 [49] (::getfield(Flux.Optimise, Symbol("#kw##train!")))(::NamedTuple{(:cb,),Tuple{getfield(Main, Symbol("##9#10"))}}, ::typeof(Flux.Optimise.train!), ::Function, ::Tracker.Params, ::Base.Iterators.Take{Base.Iterators.Repeated{Tuple{}}}, ::ADAM) at ./none:0
 [50] top-level scope at none:0
 [51] (::getfield(Atom, Symbol("##126#131")){String,String,Module})() at /home/joto/.julia/packages/Atom/cR6bU/src/eval.jl:125
 [52] withpath(::getfield(Atom, Symbol("##126#131")){String,String,Module}, ::String) at /home/joto/.julia/packages/CodeTools/xGemk/src/utils.jl:30
 [53] withpath at /home/joto/.julia/packages/Atom/cR6bU/src/eval.jl:46 [inlined]
 [54] #125 at /home/joto/.julia/packages/Atom/cR6bU/src/eval.jl:122 [inlined]
 [55] with_logstate(::getfield(Atom, Symbol("##125#130")){String,String,Module}, ::Base.CoreLogging.LogState) at ./logging.jl:395
 [56] #124 at ./logging.jl:491 [inlined]
 [57] hideprompt(::getfield(Atom, Symbol("##124#129")){String,String,Module}) at /home/joto/.julia/packages/Atom/cR6bU/src/repl.jl:75
 [58] macro expansion at /home/joto/.julia/packages/Atom/cR6bU/src/eval.jl:120 [inlined]
 [59] macro expansion at /home/joto/.julia/packages/Media/ItEPc/src/dynamic.jl:24 [inlined]
 [60] (::getfield(Atom, Symbol("##123#128")))(::Dict{String,Any}) at /home/joto/.julia/packages/Atom/cR6bU/src/eval.jl:109
 [61] handlemsg(::Dict{String,Any}, ::Dict{String,Any}) at /home/joto/.julia/packages/Atom/cR6bU/src/comm.jl:164
in expression starting at /home/joto/Projekte/Julia/diffeqtrain/fix1.jl:49

That’s highly unexpected, since most optimizers are robust to that (e.g. Optim.jl does fine in those cases). I would file an issue, and move to a nicer tone :wink:.

Your package version is old. Update it and it should work.

Note that if you’re doing parameter estimation not on neural networks, the approach in DiffEqFlux is bad. Use the stuff in DiffEqParamEstim instead. BFGS is more robust on low-dimensional spaces.
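
For reference, the DiffEqParamEstim route might look roughly like this (a sketch, assuming the plain-Float64 prob from the first branch above; ts and gdp_data are made-up names):

using DiffEqParamEstim, Optim
ts = collect(1.0:59.0)
gdp_data = reshape(GDP, 1, :)  # 1 state × 59 time points
cost = build_loss_objective(prob, Tsit5(), L2Loss(ts, gdp_data), saveat=ts)
res = optimize(cost, Flux.data(p), BFGS())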

I was not blaming Julia or any package. I seriously thought (within my limited understanding of backpropagation) that an infinite loss makes no sense to optimize inside a gradient-descent framework. I thought that problem would have to be addressed inside another framework, like restart conditions (read: global optimization strategies).
I am surprised that other optimizers are robust to that.

I updated it (still on Julia 1.0.4) and both problems persist. I get these relevant lines during the update:

  [bcd4f6db] ↑ DelayDiffEq v5.4.1 ⇒ v5.9.1
  [2b5f629d] ↑ DiffEqBase v5.11.1 ⇒ v5.16.5
  [01453d9d] ↑ DiffEqDiffTools v0.12.0 ⇒ v0.14.0
  [aae7a2af] ↑ DiffEqFlux v0.5.0 ⇒ v0.6.1
  [055956cb] ↑ DiffEqPhysics v3.1.0 ⇒ v3.2.0
  [41bf760c] ↑ DiffEqSensitivity v3.2.4 ⇒ v3.3.1

I was not aware. I thought that parameter estimation was the prime discipline of DiffEqFlux, together with support for the exotic neural ordinary differential equations type. I found the ability to add layers that create the weights for the equation curious but ultimately nonsensical. I was wrong; this seems to be the primary purpose of DiffEqFlux.
Now I am curious:

  • What applications does “a neural network setting parameters, getting the solution after some time, and feeding it back into the neural network” (I am obviously lacking a term for this idea) have? (Where does it work well?)
  • Are there plans to unite DiffEqParamEstim and DiffEqFlux, and does that question even make sense?

Thank you for the patience.

Methods that mix in a line search, like BFGS, or trust-region methods can discard points with an Inf. Since ADAM is inherently stochastic, it could have functionality to discard Infs and try different randomized points so as not to hit the bad area. Machine learning problems generally don’t have this bifurcation behavior, so it doesn’t show up as much; I think it was just overlooked. However, we know from our tests on problems with bifurcations that Optim, BlackBoxOptim, etc. can handle it.
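
A tiny illustration of that robustness (my sketch, not from this thread): Nelder-Mead only ranks candidate points, so an Inf region is simply avoided.

using Optim
# one-dimensional objective with an Inf "failure pocket", mimicking a finite escape time
f(x) = abs(x[1]) > 2 ? Inf : (x[1] - 1)^2
res = optimize(f, [0.0], NelderMead())  # converges toward x[1] ≈ 1 despite the Inf region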

We are using it for discovery of the dynamical equations. Also, we are using it for prediction with misspecified models. My JuliaCon talk will go into a bunch of different ways we’re making use of the connection.

There are plans, but it’s structurally difficult. All optimizers other than Flux want you to pass in an array p, so the cost functions were written for that. Flux has implicit parameters via its Params, so the loss functions look different. We are exploring the use cases of multiple shooting and collocation on neural networks, so we are getting similar code to work, but it’s hard to keep the same structure. So I don’t know; it may seem easier to have them be the same, but we haven’t found a good strategy yet. :man_shrugging: We’re working on it.
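
Schematically, the mismatch is (with a made-up cost helper, reusing names from the code above):

cost(sol) = sum(i -> (sol[i][1]/GDP[i] - 1)^2, 1:59)  # shared fitting cost
# explicit-parameter style (Optim / DiffEqParamEstim): parameters are an argument
loss_explicit(θ) = cost(solve(remake(prob, p=θ), Tsit5(), saveat=1.0))
# implicit-parameter style (Flux): p is captured from the enclosing scope via Params
loss_implicit() = cost(diffeq_rd(p, prob, Tsit5(), saveat=1.0:1.0:59.0))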


I will wait, then.

Love to hear that.

I think I’ll stop procrastinating about now, get some sleep, and study more, since I have an exam in ~30 hours. I hope that one day (more like in 2-4 years) I can contribute to Julia in a more major way.

General question: is it bad to stay on Julia 1.0.* if the next release is out?

I opened a lot of bug reports in projects you are involved in. Sorry for that, but they came out of this. You do great work, and your blog posts are a good read. I just hope DiffEqParamEstim gets simpler to use.

Great! Bug reports always help.

That talk is available now.

It cleared up my confusion about the use case of DiffEqFlux.