Question:
I am trying to solve a root-finding problem on a vector of 14 elements, but each function evaluation needs about 7 seconds.
What is the best solver for this?
And is there a way to declare a valid range of the parameters of my function?
Not all parameter combinations give a feasible results. How to best signal that
to the solver?
My current code (not really working, at least not or only very slowly converging):
This works, but it feels like writing my own solver:
initial = KiteControllers.load_corr()
last_norm=1000
for i in 1:40
last_initial = deepcopy(initial)
res = residual(initial)
println("i: $(i), norm: $(norm(res))")
common_size=min(length(initial), length(res))
for i=1:common_size
if norm(res) > 5
initial[i] += res[i]
elseif norm(res) > 2
initial[i] += 0.3*res[i]
else
initial[i] += 0.15*res[i]
end
end
if norm(res) < 1.0
println("Converged successfully!")
break
end
if last_norm < 0.7*norm(res)
KiteControllers.save_corr(last_initial)
println("Convergance failed!")
println("Last norm: $last_norm")
break
end
last_norm=norm(res)
end
initial
In 40 iterations I get a solution with norm(res) < 1.0.
Well, the function runs a flight simulation, the return value is the deviation of the desired flight path at 14 points… Furthermore, the return value might contain some noise…
So probably not easily differentiable, even though mostly it is true that increasing the input also increases the output… But there is some non-linear coupling between the channels (elements of the vectors) …
I’m asking because when you use FiniteDiff as the autodiff backend, each gradient evaluation costs 14 function calls. Whereas if you are able to use a reverse mod autodiff backend like Enzyme, it will go down to 1
Of cause automatic differentiation is not possible if you are investigating how a hybrid system (continues time system, controlled by a digital controller) reacts to changed control settings…
No, our automatic differentiation works on hybrid systems, both forward and reverse. See for example:
We wrote some details on this as well:
That’s for reverse mode. Forward mode AD of hybrid systems worked of course since 2017 since dosing is a classic example of that an Pumas was built around forward mode AD.
But this cannot work… I am not working on an ODE… I run a simulation, save the result as log file and then extract from the log file the position of curtain points of the flight path…
Just to give you an idea how the uncorrected flight path looks like:
log=[]
for t in 0:dt:t_end
KiteModels.next_step!(kps4, integrator, v_ro=v_ro, dt=dt)
sys_state = KiteModels.SysState(kps4)
push!(log, sys_state)
# run the discrete time controller
change_steering!(kps4)
end
# extract the deviation from the desired flight path
residual = observe(log)
I am not using any callbacks…
Using the residual, I want to correct the parameters of the discrete time controller.
IIUC you’re trying to minimize the norm of the residual. You could use Nomad.jl with the norm of the residual as the objective function. Nomad allows you to tell it when it generates an infeasible point.
function train(; max_iter=40, norm_tol=1.0)
local corr_vec
try
log = load_log("uncorrected")
ob = KiteObserver()
observe!(ob, log)
corr_vec=ob.corr_vec
catch
corr_vec=residual()
end
KiteControllers.save_corr(corr_vec)
initial = KiteControllers.load_corr()
last_norm=1000
best_corr_vec = deepcopy(initial)
best_norm = norm(best_corr_vec)
j = 0
for i in 1:max_iter
res = residual(initial)
println("i: $(i), norm: $(norm(res))")
common_size=min(length(initial), length(res))
for i = 1:common_size
if best_norm > 5
initial[i] += res[i]
elseif best_norm > 2
initial[i] += 0.5*res[i]
else
initial[i] += 0.25*best_norm*res[i]
end
end
if norm(res) < norm_tol
println("Converged successfully!")
break
end
if best_norm > norm(res)
best_norm = norm(res)
best_corr_vec = deepcopy(initial)
j = 0
println("j: $(j), best_norm= $best_norm")
else
j+=1
println("j: $j")
end
if j > 4
println("Convergance failed!")
println("Best norm: $best_norm")
break
end
last_norm=norm(res)
end
KiteControllers.save_corr(best_corr_vec)
best_corr_vec
end
It works pretty well, the norm of the error goes down to less then one degree in less than 20 iterations…
I think there is no chance to get such a fast convergence with any generic solver. I think I am exploiting here knowledge about the system that generic solvers don’t have. Limitation: I cannot achieve an error of less than 0.88 degrees… But that is good enough for my use case.
I still need to find out how robust this approach is with respect to changes of the model, e.g. changed wind speed or changed size of the kite. I will report back as soon as I found out…