Root finding / NonLinear problem

ufechner7 · April 9, 2024, 11:03am

Question:
I am trying to solve a root-finding problem on a vector of 14 elements, but each function evaluation needs about 7 seconds.

What is the best solver for this?

And is there a way to declare a valid range of the parameters of my function?
Not all parameter combinations give a feasible results. How to best signal that
to the solver?
My current code (not really working, at least not or only very slowly converging):

    u0 = KiteControllers.load_corr()
    prob = NonlinearProblem(residual, u0)
    sol = solve(prob, RobustMultiNewton(; autodiff = AutoFiniteDiff()); 
                abstol=1.0, reltol=0.05, maxiters=20, verbose=true)

And maxiters is not respected.

This works, but it feels like writing my own solver:

    initial = KiteControllers.load_corr()
    last_norm=1000
    for i in 1:40
        last_initial = deepcopy(initial)
        res = residual(initial)
        println("i: $(i), norm: $(norm(res))")
        common_size=min(length(initial), length(res))
        for i=1:common_size
            if norm(res) > 5
                initial[i] += res[i]
            elseif norm(res) > 2
                initial[i] += 0.3*res[i]
            else
                initial[i] += 0.15*res[i]
            end
        end
        if norm(res) < 1.0
            println("Converged successfully!")
            break
        end
        if last_norm < 0.7*norm(res) 
            KiteControllers.save_corr(last_initial)
            println("Convergance failed!")
            println("Last norm: $last_norm")
            break
        end
        last_norm=norm(res)
    end
    initial

In 40 iterations I get a solution with norm(res) < 1.0.

gdalle · April 9, 2024, 11:15am

Is your function differentiable?

ufechner7 · April 9, 2024, 11:19am

Well, the function runs a flight simulation, the return value is the deviation of the desired flight path at 14 points… Furthermore, the return value might contain some noise…

So probably not easily differentiable, even though mostly it is true that increasing the input also increases the output… But there is some non-linear coupling between the channels (elements of the vectors) …

gdalle · April 9, 2024, 11:28am

I’m asking because when you use FiniteDiff as the autodiff backend, each gradient evaluation costs 14 function calls. Whereas if you are able to use a reverse mod autodiff backend like Enzyme, it will go down to 1

ChrisRackauckas · April 9, 2024, 11:58am

What do you mean? It’s respected within each solve of the multi-Newton.

Doing a transformation to re-parameterize that to the real line is usually helpful.

What do you mean by noise here? That’s a big red flag for a Newton method.

No, reverse mode does not help to compute a square Jacobian.

gdalle · April 9, 2024, 12:02pm

My bad, I thought we were looking for the zero of a scalar-valued function

ufechner7 · April 9, 2024, 12:21pm

Of cause automatic differentiation is not possible if you are investigating how a hybrid system (continues time system, controlled by a digital controller) reacts to changed control settings…

ChrisRackauckas · April 9, 2024, 12:24pm

No, our automatic differentiation works on hybrid systems, both forward and reverse. See for example:

We wrote some details on this as well:

That’s for reverse mode. Forward mode AD of hybrid systems worked of course since 2017 since dosing is a classic example of that an Pumas was built around forward mode AD.

ufechner7 · April 9, 2024, 12:25pm

Well, maybe that works if you use ModellingToolkit, but I am not using it for this project…

This is what we are talking about: KiteControllers.jl/examples/learning.jl at main · aenarete/KiteControllers.jl · GitHub

ChrisRackauckas · April 9, 2024, 12:32pm

The examples I shared do not use ModelingToolkit. Any ODE where f, condition, and affect! are differentiable should work.

When you try ForwardDiff, what do you see?

ufechner7 · April 9, 2024, 12:56pm

But this cannot work… I am not working on an ODE… I run a simulation, save the result as log file and then extract from the log file the position of curtain points of the flight path…

Just to give you an idea how the uncorrected flight path looks like:

Pseudocode:

log=[]
for t in 0:dt:t_end
    KiteModels.next_step!(kps4, integrator, v_ro=v_ro, dt=dt)
    sys_state = KiteModels.SysState(kps4)
    push!(log, sys_state)
    # run the discrete time controller
    change_steering!(kps4)      
end
# extract the deviation from the desired flight path
residual = observe(log)

I am not using any callbacks…

Using the residual, I want to correct the parameters of the discrete time controller.

PeterSimon · April 9, 2024, 2:05pm

IIUC you’re trying to minimize the norm of the residual. You could use Nomad.jl with the norm of the residual as the objective function. Nomad allows you to tell it when it generates an infeasible point.

ufechner7 · April 9, 2024, 6:18pm

I have this code now:

function train(; max_iter=40, norm_tol=1.0)
    local corr_vec
    try
        log = load_log("uncorrected")
        ob = KiteObserver()
        observe!(ob, log)
        corr_vec=ob.corr_vec
    catch
        corr_vec=residual()
    end
    KiteControllers.save_corr(corr_vec)
    initial = KiteControllers.load_corr()
    last_norm=1000
    best_corr_vec = deepcopy(initial)
    best_norm = norm(best_corr_vec)
    j = 0
    for i in 1:max_iter
        res = residual(initial)
        println("i: $(i), norm: $(norm(res))")
        common_size=min(length(initial), length(res))
        for i = 1:common_size
            if best_norm > 5
                initial[i] += res[i]
            elseif best_norm > 2
                initial[i] += 0.5*res[i]
            else
                initial[i] += 0.25*best_norm*res[i]
            end
        end
        if norm(res) < norm_tol
            println("Converged successfully!")
            break
        end
        if best_norm > norm(res)
            best_norm = norm(res)
            best_corr_vec = deepcopy(initial)
            j = 0
            println("j: $(j), best_norm= $best_norm")
        else
            j+=1
            println("j: $j")
        end
        if j > 4
            println("Convergance failed!")
            println("Best norm: $best_norm")
            break
        end
        last_norm=norm(res)
    end
    KiteControllers.save_corr(best_corr_vec)
    best_corr_vec
end

It works pretty well, the norm of the error goes down to less then one degree in less than 20 iterations…

I think there is no chance to get such a fast convergence with any generic solver. I think I am exploiting here knowledge about the system that generic solvers don’t have. Limitation: I cannot achieve an error of less than 0.88 degrees… But that is good enough for my use case.

I still need to find out how robust this approach is with respect to changes of the model, e.g. changed wind speed or changed size of the kite. I will report back as soon as I found out…

Topic		Replies	Views
Multidimensional non-linear root search? Optimization (Mathematical) nonlinear , rootfinding	3	393	November 3, 2022
Help finding multiple roots to nonlinear equation - IntervalRootFinding not working Numerics	13	241	September 25, 2024
Root finding for (possibly) complex valued functions with Nonlinearsolve.jl Optimization (Mathematical) question , roots , numerics	2	609	April 22, 2023
Better way to name or inherit parameters with NonlinearSolve.jl? Numerics nonlinearsolve , parameters	3	290	February 14, 2024
Not getting roots using "IntervalRootFinding.jl" package in Pluto.jl notebook General Usage question	3	189	January 27, 2023

Root finding / NonLinear problem

Related topics