To close out this conversation from the Roots end: the newest version, which closes the issue opened based on this thread, provides a different solution.
With ForwardDiff it is important, because of how types propagate, that the initial guess has a type matching the parameter:
using Roots, ForwardDiff, Test
f( x, a ) = log( x ) - a
fₓ( x, a ) = ForwardDiff.derivative( t -> f( t, a ), x )
mexp(a) = find_zero((f, fₓ), 1, Roots.Newton(), a) # x0 = 1 -- not one(a)!
@test mexp(log(π)) ≈ π
@test_throws MethodError ForwardDiff.derivative(mexp, 1)
mexp_(a) = find_zero((f, fₓ), one(a), Roots.Newton(), a)
@test ForwardDiff.derivative(mexp_, 1) ≈ exp(1)
# using the closure style you started with
mexp1(a) = find_zero((x -> f(x,a), x->fₓ(x,a)), one(a), Roots.Newton())
@test ForwardDiff.derivative(mexp1, 1) ≈ exp(1)
The adjoint method, mentioned by @stevengj, can avoid the extra effort of pushing the AD machinery through the algorithm:
a = log(π)
xᵅ = find_zero((f,fₓ), 1, Roots.Newton(), a)
fx = ForwardDiff.derivative(x -> f(x, a), xᵅ)
fp = ForwardDiff.derivative(a -> f(xᵅ, a), a) # gradient if `a` is a vector
@test - fp / fx ≈ exp(a)
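The ratio checked in that last test comes from the implicit function theorem. As a brief sketch, writing $x^*(a)$ for the root returned by find_zero (notation introduced here for illustration) and assuming $\partial f/\partial x \neq 0$ at the root:

$$
f(x^*(a), a) = 0
\;\Longrightarrow\;
\frac{\partial f}{\partial x}\,\frac{dx^*}{da} + \frac{\partial f}{\partial a} = 0
\;\Longrightarrow\;
\frac{dx^*}{da} = -\,\frac{\partial f/\partial a}{\partial f/\partial x}.
$$

For $f(x, a) = \log x - a$ this gives $\partial f/\partial x = 1/x^*$ and $\partial f/\partial a = -1$, so $dx^*/da = x^* = e^{a}$, which is what the test compares against.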
This approach led to a new frule and rrule in Roots, so AD packages should be able to differentiate directly, provided the parameter is passed in to find_zero rather than captured in a closure:
using Zygote # uses rrule from Roots
@test first(Zygote.gradient(mexp, 1)) ≈ exp(1)