I wrote a brief post explaining the “complex step method” that was the first topic in Nick Higham’s JuliaCon 2018 talk. (Using Julia for demonstrations, of course !)
Nice post!
ForwardDiff
requires that the derivative of all functions be registered, otherwise it throws an error.
This isn’t true, actually. The reason it throws an error is because your implementation of lambertw
contains an unnecessary type restriction.
With this one line change, it works with ForwardDiff:
diff --git a/src/LambertW.jl b/src/LambertW.jl
index e6fe2f0..3db0930 100644
--- a/src/LambertW.jl
+++ b/src/LambertW.jl
@@ -71,7 +71,7 @@ end
# This appears to be inferrable with T=Float64 and T=BigFloat, including if x=Inf.
# There is a magic number here. It could be noted, or possibly removed.
# In particular, the fancy initial condition selection does not seem to help speed.
-function lambertw_branch_zero(x::T, maxits)::T where T<:AbstractFloat
+function lambertw_branch_zero(x::T, maxits)::T where T<:Real
isnan(x) && return(NaN)
x == Inf && return Inf # appears to return convert(BigFloat, Inf) for x == BigFloat(Inf)
one_t = one(T)
julia> using LambertW, ForwardDiff
julia> ForwardDiff.derivative(lambertw, 1/2)
0.5204186421068739
Thanks! The post has been amended. (And LambertW
as well.) ForwardDiff
is also twice as fast for the example in the post.
It’s a bit curious that the value computed by ForwardDiff
in the same example differs from that computed from the exact expression by eps(.)
. Not that this is of any practical consequence.
ForwardDiff
requires that the derivative of all functions be registered, otherwise it throws an error.
I mean, the functions on complex numbers also have to be defined, and these also, in some sense, involve the derivatives by thinking about the Taylor series.
Dual numbers are simpler than complex numbers and defining functions on them is simpler. It’s just that for complex numbers someone else usually has done it for you.
I’d also like to point out that the complex step method is fundamentally dangerous to apply to code that wasn’t written with this in mind. It can silently give the wrong answer:
julia> f(x) = [x]'*[1.0]
f (generic function with 1 method)
julia> deriv_cstep(f)(1.0)
-1.0
I think I agree (if I understand). The point is that this work is already done. But, I edited the post significantly. For clarity, I changed “must be encoded” to “is perforce encoded”. I also added comparisons for other use cases. For instance computing derivatives of functions in Python’s mpmath
.
Dual numbers are simpler than complex numbers and defining functions on them is simpler. It’s just that for complex numbers someone else usually has done it for you.
Yes. I added something on the waste involved in using complex arithmetic for this task. I’d like to include a more careful comparison of these methods. But, that will have to wait.
By the way, I removed this erroneous line from my post as soon as I saw @jrevels’ comment. But, it’s good to remind people here how powerful ForwardDiff
combined with generic methods can be. (Or a library using any other type of number, for that matter.)
Complex conjugation is not an analytic function:
julia> [1 + im]'
1×1 LinearAlgebra.Adjoint{Complex{Int64},Array{Complex{Int64},1}}:
1 - 1im
The entire point of my post is why the method only works for complex analytic functions. I have already added a section about non-analytic functions (including a similar silent error). But your point is important. Yes it’s dangerous, for instance if it is used in a library where this restriction may be forgotten or is hard to verify. In fact, the latter point should be added to my post (but, I have to leave it for now!)
Hello guys, I think you may want to see this:
It is about using “multi-complex” step differentiation to obtain accurate higher order derivative, I also made a package to try the relating methods, it turned out the multi complex step differentiation is amazing:
https://github.com/ErikQQY/ComplexDiff.jl
(Just register this package, so you may need to clone from GitHub directly)
pkg> add https://github.com/ErikQQY/ComplexDiff.jl
The second derivative is fine for almost all of the common functions, while the higher order derivatives may have problems🤔