Custom rule for differentiating through Newton solver using ForwardDiff: works for gradient, fails for hessian

So I profiled the call to D(p), here are the results:

A couple of remarks:

  • There seems to be some type instabilities in the custom forward rule for ImplificFunction
  • I don’t understand why, but when I execute D(p), the root function is called 7 times with the same input vector each time. I guess this is due to this loop here. Is there any way to optimize this?

Full code for reference:


# Nth order polynomial evaluated at x
eval_poly(x, p) = sum(p[i]*x^(i-1) for i in eachindex(p))

# Polynomial root (this is the slow part that should never be called with Dual numbers)
function root(p::AbstractVector{T}) where T <: Real
    f(x) = eval_poly(x, p)
    xᵅ = find_zero(f, 0.)
    return xᵅ
end

function zero_gradient(p, x)
    return [eval_poly(x[1], p)]
end

linear_solver(A, b) = (Matrix(A) \ b, (solved=true,))
implicit = ImplicitFunction(p -> [root(p)], zero_gradient, linear_solver)

@ForwardDiff_frule (f::typeof(implicit))(p::AbstractArray{<:ForwardDiff.Dual}; kwargs...)

D(p) = ForwardDiff.gradient(p -> implicit(p)[1], p)
DD(p) = ForwardDiff.hessian(p -> implicit(p)[1], p)


# Test
p = [0.4595890823651114,
    -0.12268155438606576,
    0.2555413922391766,
    -0.445873981165829,
    0.3981791064025203,
    -0.14840698579393818,
    -0.09402206963799742]

root(p)

@profview @btime D($p)
1 Like