Automatic differentiation of function with special points

I am using ForwardDiff.jl to compute derivatives of functions which often contain functions of the form:

f(x,y) = x/[exp(x/y)-1]

which one can show using L’Hôpital’s rule has the value f(0,y) = y. One way to construct such a function is:

f(x, y) = (x == 0.0 ? y : x/(exp(x/y)-1) )

This is continuous with respect to x, but ForwardDiff understandably gives an x-derivative of zero at x=0. The actual derivative at x=0 should be f’(0,y) = -1/2. If I only required the first derivative, then I could put in a workaround to take this into account (e.g. x == 0.0 ? y-x/2 : ... ). However, I also intend to use the second and third derivatives wrt x and I’d like to know how to generalise to similar functions.

Is there a way to do this by defining custom derivatives? Or put another way, why does ForwardDiff work on the sinc function when it also contains a special point similar to this at x=0?

ForwardDiff should ignore branches like x == 0.0 as these often lead to wrong answers (like here). For various reasons this is only on master branch right now. It leads to NaN instead of a wrong answer.

One way around this is to splice in a Taylor expansion on a finite region. A very quick attempt (without thinking carefully about how it will behave at very small nonzero x) looks like this:

julia> f1(x, y) = x/(exp(x/y)-1);
julia> f2(x, y) = x == 0.0 ? y : x/(exp(x/y)-1);
julia> f3(x, y) = abs2(x)>eps() ? x/(exp(x/y)-1) : y - x/2 + x^2/(12*y) - x^4/(720*y^3);

julia> using ForwardDiff  # master, i.e. v0.11

julia> ForwardDiff.derivative(x -> f1(x, 1), 0)
NaN

julia> ForwardDiff.derivative(x -> f2(x, 1), 0)  # was 0 on v0.10
NaN

julia> ForwardDiff.derivative(x -> f3(x, 1), 0)
-0.5

This has an explicit rule in DiffRules.jl. Note that ForwardDiff reads these rules once while loading, so you can’t easily add rules for your functions this way. You can (as mentioned) add methods like sinc(x::Dual) :

julia> ForwardDiff.derivative(sinc, 0.0)  # this calls cosc
-0.0

julia> @which sinc(ForwardDiff.Dual(0, 1))
sinc(d::Dual{T}) where T
     @ ForwardDiff ~/.julia/dev/ForwardDiff/src/dual.jl:238

julia> sinc1(x) = sin(x)/x;
julia> sinc2(x) = x==0 ? one(x) : sin(x)/x;
julia> sinc3(x) = abs2(x) > eps() ? sin(x)/x : 1 - x^2/6 + x^4/120;

julia> ForwardDiff.derivative(sinc1, 0.0)
NaN

julia> ForwardDiff.derivative(sinc2, 0.0)  # was 0.0 on ForwardDiff v0.10
NaN

julia> ForwardDiff.derivative(sinc3, 0.0)  # thanks to taylor series
0.0

julia> using ForwardDiff: Dual, value, partials

julia> sinc4(x::Dual{Z}) where Z = Dual{Z}(value(x), cosc(value(x)) * partials(x));

julia> ForwardDiff.derivative(sinc4, 0.0)
-0.0
2 Likes

This is mostly unrelated to the ForwardDiff question, but for things like this expm1 is very much your friend.

1 Like

This is interesting, thank you I have used a Taylor series approach to resolve this issue for now. Is there any particular advantage in defining function(x::Dual{Z}) where ... as opposed to/in addition to function(x) = abs(x) > eps() ? ...? Ultimately the definition of cosc(x) still requires a value to be defined at x=0, right?

Indeed it is, thanks for reminding me of this.