Autodiff removing singularity

I have a function whose derivative isn’t defined at the origin, but is known ahead of time when approaching from x>0, which is the limit I want. How can I add a special case to the ForwardDiff dual such that it uses this value at the origin but uses the default process elsewhere?

I am sure this has been asked before, but I m must be using the wrong key words

Something like this for the absolute value? Inspired by the ForwardDiff docs

``````julia> using ForwardDiff: Dual, derivative, value, partials

julia> myabs(x::Real) = abs(x)
myabs (generic function with 1 method)

julia> function myabs(d::Dual{T}) where {T}
x, Δx = value(d), partials(d)
slope = ifelse(x <= zero(x), -one(x), one(x))  # note the <=
return Dual{T}(x, slope * Δx)
end
myabs (generic function with 2 methods)

julia> derivative(abs, 0)
1

julia> derivative(myabs, 0)
-1
``````

Note that in general, automatic differentiation applies to programs, not functions, and so the behavior at nondifferentiable points is implementation-dependent.

A mathematical model for automatic differentiation in machine learning, Bolte & Pauwels (2020)

1 Like

Thanks! Your code defines the derivative everywhere, but I only want to over-write the normal behavior at the origin. I guess I could try…

``````_myfun(x) = abs(x) # real function is more complicated
myfun(x) = _myfun(x)
myfun(d::Dual{T}) where {T} = ifelse(value(d) == zero(value(d)), Dual{T}(value(d), 2pi * partials(d)), _myfun(d))
``````

If I don’t wrap the function, I don’t see what I could call in the “else” part, but I guess this works…

1 Like

Oh no, another wrinkle! My real function has other arguments. When I call `derivative`, I make an anonymous function. How am I supposed to handle that?

``````myfun(x,y) = abs(x-y) # but harder
derivative(x->myfun(x,y),y) # want to fix behavior at x==y
``````

I guess, I’ll need to work backwards through the function chain until I get to the actual singularity…

1 Like

Guess you have to overload `myfun` for both inputs being `Dual`

Just for future reference. My real function had a vector input `x` and additional arguments such as a reference location `y` and direction `n`. The basic solution method is the same, but I had to handle the partials of `d` in a way general enough to handle both `gradient(x->ϕ(x,y,n),x)` and `derivative(t->ϕ(x+t*n,y,n),0)`-type calls.

``````_ϕ(x,y,n) = # iterative vector function with scalar output
ϕ(x,y,n) = _ϕ(x,y,n)
function ϕ(d::Vector{<:Dual{Tag}},y,n) where Tag
x,Δx = value.(d),stack(partials.(d))
ifelse(x == y, Dual{Tag}(_ϕ(x,y,n),2π*Δx*n...), _ϕ(d,y,n))
end
``````