DiffEq: Flux.train! crashes mid-training and requests larger maxiters

Hello everyone,

I’m trying to optimize the parameters of a nonlinear differential equation towards some known parameters using the ADAM optimizer (benchmarking different optimizers for later use). For some initial conditions, the solver is interrupted in the middle of the training process and returns the warning below:

┌ Warning: Interrupted. Larger maxiters is needed.
└ @ DiffEqBase \.julia\packages\DiffEqBase\3iigH\src\integrator_interface.jl:329

The appearance of this warning seems to depend on the learning rate of the ADAM optimizer but not on the number of iterations specified in my iterator object. The warning appears later, or not at all, when I reduce the learning rate. Where does this error come from?

Below is the part of my code that deals with the optimization.


function predict_rd() # solve the ODE with the current parameters p
  sol = solve(prob,Tsit5(),p=p,saveat=0.005)
  return sol
end

function loss_rd() # loss function
  sol = predict_rd()
  sol = sol[eval_index,:]
  loss = sum(abs2,sol-ref)
  if loss <= tol
    display("Converged below tolerance - stopping")
    Flux.stop()
  end
  return loss
end

t = 0:0.005:200.0
data = Iterators.repeated((), 500)
opt = ADAM(0.1)

cb = function() #callback function to observe training
  display(loss_rd())
  return
end

# Display the ODE with the initial parameter values.
curr_sol = solve(remake(prob,p=p),Tsit5(),saveat=0.005)
display(plot(t, curr_sol[eval_index,:], ylim=(-1.25, 1.25), label = "Model"))
display(plot!(t,ref, label = "Target"))
display(plot!(t, og_sweep.(t), label = "Sweep"))

# Train
Flux.train!(loss_rd, params, data, opt, cb = cb)

# Plot results
display(p)
curr_sol = solve(remake(prob,p=p),Tsit5(),saveat=0.005)
display(plot(t, curr_sol[eval_index,:], ylim=(-1.25, 1.25), label = "Model"))
display(plot!(t,ref, label = "Target"))

For your ODE, there can be parameter values at which it diverges or becomes too stiff for the method you chose, and the optimizer can wander into that region. Try a method for stiff equations (e.g. KenCarp47(autodiff=false)) or try reducing the learning rate. If you’re still seeing it, take the parameters at the end of the optimization, solve the ODE with them, and then plot/analyze the output. Find out why it’s diverging at those parameters. That will tell you what you need to know about the system.
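If the goal is just to keep training alive while you investigate, here is a minimal sketch of a guarded loss (it reuses prob, p, ref and eval_index from the code above; the stiff solver and the penalty value are only placeholders, and note that a constant penalty carries no useful gradient):

function loss_rd_safe() # like loss_rd, but tolerates an interrupted solve
  sol = solve(prob, KenCarp47(autodiff=false), p=p, saveat=0.005)
  if sol.retcode != :Success   # e.g. the solve hit maxiters and was interrupted
    return 1e6                 # large penalty so training continues instead of crashing
  end
  return sum(abs2, sol[eval_index,:] - ref)
end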

Thanks a lot! I suspected that something was diverging, so I plotted the parameters of the ODE over the iterations performed. Shortly before the crash, one parameter becomes negative… This parameter is a non-dimensional coefficient and should never become negative in the first place… Is there some way to constrain the parameters in my search, or do I need to implement this manually in the loss function / switch to another optimizer?
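For reference, the logging was roughly like this (p_history and cb_log are just names I’m making up here; p is the tracked parameter vector from the code above, and cb_log gets passed to Flux.train! via cb = cb_log):

p_history = []     # snapshot of the parameters at every iteration

cb_log = function() # callback: record the current parameters, then show the loss
  push!(p_history, copy(p))
  display(loss_rd())
  return
end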

One thing you can do is reparameterize: optimize the log of that parameter and use its exp (or optimize its square root and square it) before putting it into the problem, so the value the ODE sees is always positive. That will work with any optimizer. Or you can use the Fminbox stuff from Optim to do box-constrained BFGS.
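A minimal sketch of the log/exp version (θ and p0 are just names for this example; p0 is the known-positive initial guess, and θ should be set up and tracked the same way p was in the code above):

θ = log.(p0)      # the optimizer updates θ, which is unconstrained

function loss_rd_pos() # same as loss_rd, but the ODE only ever sees exp.(θ) > 0
  sol = solve(prob, Tsit5(), p=exp.(θ), saveat=0.005)
  return sum(abs2, sol[eval_index,:] - ref)
end

Flux.train!(loss_rd_pos, Flux.params(θ), data, opt, cb = cb)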
