I have written an algorithm that solves
for y given x, a_i, z_i. It uses Newton’s method, which is the trivial part; the difficulty was getting a cheap but reasonable initial guess (I use an envelope of linear tangents, which works nicely). Assume that it is implemented as
solver(a::SVector, z::SVector, x::Real, y::Real = initial_guess(a, z, x))
where initial_guess
is differentiable with AD.
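For concreteness, here is a hypothetical sketch of the solver's shape (the actual residual equation is not shown above, so a toy one is used; `newton_solve` and its arguments are illustrative names, not the real API):

```julia
# Hypothetical sketch: Newton's method started from a cheap initial guess.
# f is the residual, df its derivative in y, y0 the initial guess.
function newton_solve(f, df, y0; tol = 1e-12, maxiter = 100)
    y = y0
    for _ in 1:maxiter
        r = f(y)
        abs(r) < tol && return y
        y -= r / df(y)   # Newton step
    end
    error("Newton did not converge")
end

# toy usage: solve y^3 - 2 = 0 starting from y0 = 1
yhat = newton_solve(y -> y^3 - 2, y -> 3y^2, 1.0)
```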
I would like solver
to work nicely with AD, especially ForwardDiff.jl and, if feasible, Enzyme.jl. “Nicely” in this context means that the partials do not need to propagate through the Newton iterations, and that overflow is avoided — see below.
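One common pattern for ForwardDiff is to intercept `Dual` inputs, solve at the primal values, and reattach partials computed from the implicit function theorem. A sketch in Julia-flavored pseudocode (assuming a method can be added to `solver`; the derivative expression depends on the actual residual and is left as a placeholder):

```julia
# pseudocode sketch: strip the duals so Newton runs on plain floats
function solver(a, z, x::ForwardDiff.Dual{T}) where {T}
    xv = ForwardDiff.value(x)
    y  = solver(a, z, xv)    # primal solve; no duals inside Newton
    dydx = ...               # implicit function theorem, computed stably
    ForwardDiff.Dual{T}(y, dydx .* ForwardDiff.partials(x))
end
```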
An added wrinkle is that, by the implicit function theorem, the derivatives have the form

∂y/∂x = −(∂f/∂x) / (∂f/∂y)

and

∂y/∂a_i = −(∂f/∂a_i) / (∂f/∂y),

where f(y; a, z, x) = 0 is the equation the solver finds a root of.
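The implicit-function-theorem derivative is easy to sanity-check numerically on a toy problem (hypothetical residual, since the actual equation is not shown; the closed-form `cbrt` stands in for the Newton solver):

```julia
# toy residual f(y; x) = y^3 - x = 0, so dy/dx = -(df/dx)/(df/dy) = 1/(3y^2)
f(y, x) = y^3 - x
solve(x) = cbrt(x)               # closed form stands in for the Newton solver

x = 2.0
y = solve(x)
dy_dx = -(-1.0) / (3y^2)         # -(df/dx) / (df/dy)

# compare with a central finite difference
h = 1e-6
fd = (solve(x + h) - solve(x - h)) / (2h)
```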
The problem with this is that a naive implicit implementation overflows if any a_i + z_i y is even moderately large, which can easily happen.
So, what’s my best bet for hooking this into a generic AD framework that supports automatic differentiation with respect to a, z, and x, and allows me to handle the overflow issue? (The fix itself is trivial, you just subtract the largest term before exponentiating, but it needs to be handled nevertheless.)