Autodiff removing singularity

I have a function whose derivative isn’t defined at the origin, but is known ahead of time when approaching from x>0, which is the limit I want. How can I add a special case to the ForwardDiff dual such that it uses this value at the origin but uses the default process elsewhere?

I am sure this has been asked before, but I must be using the wrong keywords.

Something like this, for the absolute value? Inspired by the ForwardDiff docs:

julia> using ForwardDiff: Dual, derivative, value, partials

julia> myabs(x::Real) = abs(x)
myabs (generic function with 1 method)

julia> function myabs(d::Dual{T}) where {T}
           x, Δx = value(d), partials(d)
           slope = ifelse(x <= zero(x), -one(x), one(x))  # note the <=
           return Dual{T}(x, slope * Δx) 
       end
myabs (generic function with 2 methods)

julia> derivative(abs, 0)
1

julia> derivative(myabs, 0)
-1

Note that in general, automatic differentiation applies to programs, not functions, and so the behavior at nondifferentiable points is implementation-dependent.
See this article for details:

A mathematical model for automatic differentiation in machine learning, Bolte & Pauwels (2020)
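For instance (a hedged sketch using ForwardDiff's default rules): abs(x) and sqrt(x^2) are the same mathematical function, but as programs they differentiate differently at 0:

```julia
using ForwardDiff: derivative

# abs has a hand-written rule that picks a one-sided value at 0
derivative(abs, 0.0)             # 1.0
# sqrt(x^2) goes through the chain rule, which hits 0 * Inf at 0
derivative(x -> sqrt(x^2), 0.0)  # NaN
```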


Thanks! Your code defines the derivative everywhere, but I only want to overwrite the normal behavior at the origin. I guess I could try…

_myfun(x) = abs(x) # real function is more complicated
myfun(x) = _myfun(x)
# branch instead of ifelse, so _myfun(d) isn't evaluated at the origin
myfun(d::Dual{T}) where {T} = iszero(value(d)) ? Dual{T}(value(d), 2pi * partials(d)) : _myfun(d)

If I don’t wrap the function, I don’t see what I could call in the “else” part, but I guess this works…
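Restated as a self-contained check (with iszero, and a branch so _myfun isn't called at the origin; the 2pi slope is just the placeholder value from above):

```julia
using ForwardDiff: Dual, derivative, value, partials

_myfun(x) = abs(x)  # stand-in for the real, more complicated function
myfun(x) = _myfun(x)
# Override only at the origin; fall through to the default rule elsewhere
myfun(d::Dual{T}) where {T} =
    iszero(value(d)) ? Dual{T}(value(d), 2pi * partials(d)) : _myfun(d)

derivative(myfun, 0.0)  # 2π ≈ 6.2832, the overridden slope
derivative(myfun, 1.0)  # 1.0, the default rule for abs
```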


Oh no, another wrinkle! My real function has other arguments. When I call derivative, I make an anonymous function. How am I supposed to handle that?

myfun(x,y) = abs(x-y) # but harder
derivative(x->myfun(x,y),y) # want to fix behavior at x==y

I guess I’ll need to work backwards through the function chain until I get to the actual singularity…


I guess you have to overload myfun for the case where both inputs are Dual.
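Although, in the derivative(x->myfun(x,y), y) pattern only the first argument actually carries a Dual, so one extra method may be enough; a hedged sketch (abs(x-y) and the 2pi slope are placeholders):

```julia
using ForwardDiff: Dual, derivative, value, partials

_myfun(x, y) = abs(x - y)  # stand-in for the real two-argument function
myfun(x, y) = _myfun(x, y)
# In derivative(x -> myfun(x, y), y) only the first argument is a Dual
myfun(d::Dual{T}, y::Real) where {T} =
    value(d) == y ? Dual{T}(_myfun(value(d), y), 2pi * partials(d)) : _myfun(d, y)

derivative(x -> myfun(x, 2.0), 2.0)  # 2π at the singularity x == y
derivative(x -> myfun(x, 2.0), 3.0)  # 1.0 away from it
```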

Just for future reference. My real function had a vector input x and additional arguments such as a reference location y and direction n. The basic solution method is the same, but I had to handle the partials of d in a way general enough to handle both gradient(x->ϕ(x,y,n),x) and derivative(t->ϕ(x+t*n,y,n),0)-type calls.

_ϕ(x,y,n) = # iterative vector function with scalar output
ϕ(x,y,n) = _ϕ(x,y,n)
function ϕ(d::Vector{<:Dual{Tag}},y,n) where Tag
    x, Δx = value.(d), stack(partials.(d))
    # branch instead of ifelse, so the singular _ϕ(d,y,n) call isn't evaluated at x == y
    x == y ? Dual{Tag}(_ϕ(x,y,n), (2π*Δx*n)...) : _ϕ(d,y,n)
end
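A self-contained toy version of this pattern, with norm standing in for the real iterative function and 2π as a placeholder directional slope (requires Julia 1.9+ for stack); it handles both call patterns:

```julia
using ForwardDiff: Dual, derivative, gradient, value, partials
using LinearAlgebra: norm, normalize

_ψ(x, y, n) = norm(x .- y)  # toy stand-in: nondifferentiable at x == y
ψ(x, y, n) = _ψ(x, y, n)
function ψ(d::Vector{<:Dual{Tag}}, y, n) where {Tag}
    x, Δx = value.(d), stack(partials.(d))  # Δx: partials × inputs matrix of seeds
    if x == y
        # project the seed matrix onto the approach direction n
        Dual{Tag}(_ψ(x, y, n), (2π * Δx * n)...)
    else
        _ψ(d, y, n)
    end
end

y = [1.0, 2.0]
n = normalize([1.0, 1.0])
gradient(x -> ψ(x, y, n), y)               # 2π .* n at the singularity
derivative(t -> ψ(y .+ t .* n, y, n), 0.0) # 2π along n
```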