# Root finding / NonLinear problem

Question:
I am trying to solve a root-finding problem on a vector of 14 elements, but each function evaluation needs about 7 seconds.

What is the best solver for this?

And is there a way to declare a valid range of the parameters of my function?
Not all parameter combinations give a feasible results. How to best signal that
to the solver?
My current code (not really working, at least not or only very slowly converging):

``````    u0 = KiteControllers.load_corr()
prob = NonlinearProblem(residual, u0)
sol = solve(prob, RobustMultiNewton(; autodiff = AutoFiniteDiff());
abstol=1.0, reltol=0.05, maxiters=20, verbose=true)
``````

And maxiters is not respected.

This works, but it feels like writing my own solver:

``````    initial = KiteControllers.load_corr()
last_norm=1000
for i in 1:40
last_initial = deepcopy(initial)
res = residual(initial)
println("i: \$(i), norm: \$(norm(res))")
common_size=min(length(initial), length(res))
for i=1:common_size
if norm(res) > 5
initial[i] += res[i]
elseif norm(res) > 2
initial[i] += 0.3*res[i]
else
initial[i] += 0.15*res[i]
end
end
if norm(res) < 1.0
println("Converged successfully!")
break
end
if last_norm < 0.7*norm(res)
KiteControllers.save_corr(last_initial)
println("Convergance failed!")
println("Last norm: \$last_norm")
break
end
last_norm=norm(res)
end
initial
``````

In 40 iterations I get a solution with norm(res) < 1.0.

Well, the function runs a flight simulation, the return value is the deviation of the desired flight path at 14 pointsâ€¦ Furthermore, the return value might contain some noiseâ€¦

So probably not easily differentiable, even though mostly it is true that increasing the input also increases the outputâ€¦ But there is some non-linear coupling between the channels (elements of the vectors) â€¦

Iâ€™m asking because when you use `FiniteDiff` as the autodiff backend, each gradient evaluation costs 14 function calls. Whereas if you are able to use a reverse mod autodiff backend like `Enzyme`, it will go down to 1

What do you mean? Itâ€™s respected within each solve of the multi-Newton.

Doing a transformation to re-parameterize that to the real line is usually helpful.

What do you mean by noise here? Thatâ€™s a big red flag for a Newton method.

No, reverse mode does not help to compute a square Jacobian.

1 Like

My bad, I thought we were looking for the zero of a scalar-valued function

Of cause automatic differentiation is not possible if you are investigating how a hybrid system (continues time system, controlled by a digital controller) reacts to changed control settingsâ€¦

No, our automatic differentiation works on hybrid systems, both forward and reverse. See for example:

We wrote some details on this as well:

Thatâ€™s for reverse mode. Forward mode AD of hybrid systems worked of course since 2017 since dosing is a classic example of that an Pumas was built around forward mode AD.

Well, maybe that works if you use ModellingToolkit, but I am not using it for this projectâ€¦

This is what we are talking about: KiteControllers.jl/examples/learning.jl at main Â· aenarete/KiteControllers.jl Â· GitHub

The examples I shared do not use ModelingToolkit. Any ODE where `f`, `condition`, and `affect!` are differentiable should work.

When you try ForwardDiff, what do you see?

But this cannot workâ€¦ I am not working on an ODEâ€¦ I run a simulation, save the result as log file and then extract from the log file the position of curtain points of the flight pathâ€¦

Just to give you an idea how the uncorrected flight path looks like:

Pseudocode:

``````log=[]
for t in 0:dt:t_end
KiteModels.next_step!(kps4, integrator, v_ro=v_ro, dt=dt)
sys_state = KiteModels.SysState(kps4)
push!(log, sys_state)
# run the discrete time controller
change_steering!(kps4)
end
# extract the deviation from the desired flight path
residual = observe(log)
``````

I am not using any callbacksâ€¦

Using the residual, I want to correct the parameters of the discrete time controller.

IIUC youâ€™re trying to minimize the norm of the residual. You could use Nomad.jl with the norm of the residual as the objective function. Nomad allows you to tell it when it generates an infeasible point.

I have this code now:

``````function train(; max_iter=40, norm_tol=1.0)
local corr_vec
try
ob = KiteObserver()
observe!(ob, log)
corr_vec=ob.corr_vec
catch
corr_vec=residual()
end
KiteControllers.save_corr(corr_vec)
last_norm=1000
best_corr_vec = deepcopy(initial)
best_norm = norm(best_corr_vec)
j = 0
for i in 1:max_iter
res = residual(initial)
println("i: \$(i), norm: \$(norm(res))")
common_size=min(length(initial), length(res))
for i = 1:common_size
if best_norm > 5
initial[i] += res[i]
elseif best_norm > 2
initial[i] += 0.5*res[i]
else
initial[i] += 0.25*best_norm*res[i]
end
end
if norm(res) < norm_tol
println("Converged successfully!")
break
end
if best_norm > norm(res)
best_norm = norm(res)
best_corr_vec = deepcopy(initial)
j = 0
println("j: \$(j), best_norm= \$best_norm")
else
j+=1
println("j: \$j")
end
if j > 4
println("Convergance failed!")
println("Best norm: \$best_norm")
break
end
last_norm=norm(res)
end
KiteControllers.save_corr(best_corr_vec)
best_corr_vec
end
``````

It works pretty well, the norm of the error goes down to less then one degree in less than 20 iterationsâ€¦

I think there is no chance to get such a fast convergence with any generic solver. I think I am exploiting here knowledge about the system that generic solvers donâ€™t have. Limitation: I cannot achieve an error of less than 0.88 degreesâ€¦ But that is good enough for my use case.

I still need to find out how robust this approach is with respect to changes of the model, e.g. changed wind speed or changed size of the kite. I will report back as soon as I found outâ€¦