(Insanely-) Higher Order Derivatives

What’s the recommended way to obtain insanely high-order derivatives of a (terribly complicated) complex-valued function in Julia? Presently I’m using TaylorSeries.jl and it works fantastically.

However, I need to compute derivatives of order 500-1000 and quickly run into precision issues. I am using BigFloat and have setprecision(BigFloat, 300, base=10). If I increase the precision significantly, the computation time increases many-fold.

What would you recommend me to do in this situation? Thanks!

3 Likes

You can use Cauchy’s integral formula with an optimal choice of contour:

I’m not aware of a Julia implementation, but it wouldn’t be too hard to put together (Folkmar is a Julia user, so it’s possible it exists somewhere).

10 Likes

thanks, would you be able to comment on whether, if I implement this method, Julia would be able to store the value of the derivatives? What I mean is, even 100! is of the order of 10^157, so I’m not sure how to store 1000! (maybe I should have included this in the original question)

I don’t see the connection, but what people normally do is work with loggamma. That is, if one wants to evaluate x/factorial(n), instead evaluate exp(log(x) - loggamma(n+1)).
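A small illustration (the specific numbers are just for demonstration):

using SpecialFunctions

x, n = 2.5e10, 12
x / factorial(n)                  # ≈ 52.19
exp(log(x) - loggamma(n + 1))     # same value, without forming n! explicitly

# for n ≈ 1000, factorial(n) overflows Int64 (and n! ≈ 4e2567 overflows Float64),
# but loggamma(n + 1) stays a modest number:
loggamma(1001)                    # ≈ 5912.13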

My apologies - I realize now that I didn’t express my question properly; let me try to amend that.

So the issue is that if I use TaylorSeries.jl, it expands my terribly complicated function (of one Taylor1 variable t and some parameters) into powers of t. Of course, I need a cut-off on the expansion, and this introduces serious error in the value of the function (and naturally its derivatives) for specific choices of parameters; the expansion is only valid for very, very low values of t, which is not desirable for me. The only way to fix this is to include higher powers of t, and the computation time increases very quickly with that.

So I’m looking for ways to compute insanely high-order derivatives in Julia without the Taylor expansion. Does that make things clearer?

I think @dlfivefifty is suggesting exactly that with the reference. Specifically, if I followed correctly, I think he is proposing this algorithm from the paper:

That is directly computing the nth derivative (at 0), not e.g. the nth term in a Taylor series.

7 Likes

presumably the connection is that there’s a prefactor of n! in the Cauchy integral formula for the nth-order derivative.
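For reference, the formula in question is

f^{(n)}(z_0) = \frac{n!}{2\pi i} \oint_{|z - z_0| = r} \frac{f(z)}{(z - z_0)^{n+1}} \, \mathrm{d}z,

so the n! prefactor shows up even when the integral itself is computed numerically (hence the loggamma suggestion above).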

2 Likes

If your goal is to approximate a complicated function, are you open to methods besides Taylor approximations? Extremely high-order terms in a Taylor expansion are likely to run into the kinds of problems you are seeing. Is there a specific reason that you need to use a Taylor expansion for your application?

Some alternatives might be any kind of regression-fit model, or a basis-spline approximation to your function, or a neural net.

4 Likes

I am also curious what the context is here. I have never heard of someone needing the 1,000th derivative of a function.

10 Likes

I’m open to anything that gets the job done quickly. I’ll check those alternatives, thanks!

two words: “Theoretical Physics” :joy:

thanks, will check this out!

I can think of a reason. Let \{c_q\}_{q \in \mathbb N} be an integer sequence and consider its exponential generating function f(x) = \sum_{q \in \mathbb N} c_q \frac{x^q}{q!} \in \mathbb{Z}[[x]]. This form is often easier to work with. Then one has \frac{\mathrm{d}^n}{\mathrm{d}x^n}f(x)\,\big|_{x = 0} = c_n.

(Not to say that this is the most efficient way to get it)

In theory, you should be able to use Diffractor. We don’t have a public interface for this hooked up yet, and I have never tried something that high. This would look something like:

using Diffractor
using Diffractor: ∂☆, ZeroBundle, TaylorBundle
using Diffractor: bundle, first_partial, TaylorTangentIndex

partials = ntuple(i->BigFloat(i==1000),  1000)
tbun = ∂☆{1000}()(ZeroBundle{1000}(my_function), TaylorBundle{1000}(big"1.1", partials))
d1000 = tbun[TaylorTangentIndex(1)]

However, this currently doesn’t work, due to a bug.
It will also definitely spend a lot of time compiling, but hopefully less time than TaylorSeries.jl.

I think, though, that it is better to find a nicer way to do this with nicer maths, although a Taylor series is kind of already the nicer thing…

1 Like

Definitely the Cauchy approach is going to be easier than trying to differentiate 1000 times.

The basics of the math are here:

I don’t have access to the reference above from @dlfivefifty, but it seems to have a numerically stable way to calculate the contour integral.

I’m not sure why you wouldn’t just use quadgk instead of their trapezoid sum though. There may be a reason, but it’s not obvious to me.

If I’m right, you could start with something like:

using QuadGK, SpecialFunctions

f(z) = # your function here

function contourdiff(f, z0, r, n)
    # parametrize the contour as z = z0 + r*cis(t), so dz = im*r*cis(t) dt
    integrand(t) = (w = r * cis(t); f(z0 + w) / w^(n + 1) * im * w)
    res, err = quadgk(integrand, 0, 2pi)
    # multiply by n!/(2πi); loggamma avoids forming the huge factorial n!
    return exp(log(res) + loggamma(n + 1)) / (2pi * im)
end

I’m just spitballing this, so check I haven’t done anything dumb, but basically contourdiff(f, z, r, n) will calculate the nth derivative of f at z using a circular contour of radius r around z, with Julia’s default Gauss–Kronrod quadrature.

You could vary r and try to find a stable result if necessary. If it doesn’t work, perhaps you need to read the specifics of that paper.
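As a quick sanity check (every derivative of exp at 0 equals 1), something like this should return approximately 1:

contourdiff(exp, 0.0 + 0.0im, 1.0, 5)   # ≈ 1.0 + 0.0im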

2 Likes

The integrand is analytic, so the trapezium rule converges exponentially fast. Using QuadGK will be much, much slower.

4 Likes

In practice, it’s not so simple, especially at ordinary error tolerances (e.g. rtol=1e-12).

  • if your integrand is sharply peaked (e.g. you have a pole close to the contour) then h-adaptive integration can often reach a fixed tolerance much faster than a trapezoidal rule, because it is able to adaptively place more quadrature points close to the peak (whereas a uniform-spacing trapezoidal rule won’t start to converge exponentially until the resolution everywhere is high).

  • I often observe h-adaptive quadrature with a Gauss–Kronrod rule (of reasonable order) to have convergence that appears exponential for smooth functions until you go to very high precisions.

For example, consider the function f(z) = 1/(z-0.979-0.198im), which has a pole just inside the unit circle, and suppose we want the contour integral of f around the unit circle: \oint f(z) dz = \int_0^{2\pi} f(e^{i\phi}) i e^{i\phi} d\phi. By the Cauchy residue theorem, the exact integral is 2\pi i. Let us compare quadgk vs the trapezoidal rule (equivalent to a Riemann sum for periodic functions), plotting the error vs. the number of evaluations (limited by maxevals=N in quadgk). I find:
[plot: error vs. number of function evaluations for quadgk and the trapezoidal rule]
Note that there is not much noticeable difference in the rate of convergence once the two methods start converging — they both seem to drop almost exponentially fast (nearly straight lines if you switch to a semilog scale) once you get sufficiently many points, until it reaches the limits imposed by machine precision. However, adaptive quadrature (quadgk) reaches this fast-converging regime much earlier than the trapezoidal/Riemann quadrature method due to the latter’s uniform spatial resolution.
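A rough sketch of this comparison (the evaluation counts are illustrative and the plotting is omitted):

using QuadGK

f(z) = 1 / (z - 0.979 - 0.198im)     # pole just inside the unit circle
exact = 2pi * im                     # exact contour integral, by the residue theorem
g(ϕ) = f(cis(ϕ)) * im * cis(ϕ)       # integrand in the angle variable

# trapezoidal rule for a periodic integrand = uniform Riemann sum
trap(N) = sum(g(2pi * k / N) for k in 0:N-1) * 2pi / N

# adaptive Gauss–Kronrod, capped at roughly N evaluations
gk(N) = quadgk(g, 0, 2pi; maxevals = N)[1]

for N in (16, 64, 256, 1024, 4096)
    println(N, ":  trapezoid ", abs(trap(N) - exact), "   quadgk ", abs(gk(N) - exact))
end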

Now, if you go to much higher precision, e.g. BigFloat with setprecision(512), then you will notice that asymptotically quadgk is eventually converging more slowly than the trapezoidal rule — you only get power-law rather than exponential convergence. On the other hand, the QuadGK manual advises you to increase the order of the rule in rough proportion to the precision—the default order is 7, so with 512 bits of precision you should use about 8x the order, say order=55. In this case, the convergence rate is again nearly indistinguishable from the trapezoidal rule until you hit the 1e-154 precision limit, and again the adaptive method hits the “fast converging regime” much faster:
[plot: the same comparison at 512-bit BigFloat precision with order=55]
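A minimal sketch of the corresponding high-precision setup (the tolerance here is just illustrative):

using QuadGK

setprecision(BigFloat, 512) do
    a = big(0.979) + big(0.198) * im          # same pole, just inside the unit circle
    g(ϕ) = im * cis(ϕ) / (cis(ϕ) - a)         # f(e^{iϕ}) i e^{iϕ} with f(z) = 1/(z - a)
    val, err = quadgk(g, big(0), 2big(pi); order = 55, rtol = 1e-140)
    abs(val - 2big(pi) * im)                  # error relative to the exact value 2πi
end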

25 Likes

I’m still not sure exactly what your end goal is, but the Arb library has very good support for computing high-order Taylor expansions at high precision. It can be used in Julia through Arblib.jl. I don’t know what function you want to compute the Taylor series of, so I can’t do a proper comparison, but below are some examples of computing extremely high-degree (order 10000) Taylor expansions.

julia> using Arblib

julia> f(x) = sqrt(1 + sin(x)^2)
f (generic function with 1 method)

julia> x = ArbSeries((0.1, 1), degree = 10000);

julia> @time f(x)[end]
  4.550465 seconds (627.35 k allocations: 1.312 GiB, 0.15% compilation time)
[-1.121381426722549659675060701548094882656285171e+514 +/- 3.31e+468]

Note that it computes error bounds for the values, so the result is guaranteed to lie in the given ball. It can also do complex expansions and has good support for many special functions. For example, here is the computation of 10000 terms of the expansion of the erf function:

julia> using Arblib, SpecialFunctions

julia> x = AcbSeries((0.1 + 0.2im, 1), degree = 10000);

julia> @time erf(x)[end]
  0.013758 seconds (40.04 k allocations: 4.581 MiB)
[1.83867928965917293650933334372996992862078316238787160730402039443837e-16316 +/- 2.98e-16385] + [-4.529141059046617219579172857870922200590190434569457278410676338456e-16318 +/- 2.90e-16385]im

Of course, if you only need the derivative for specific orders then the contour integration discussed above will be faster in general!

You mention that your expansions are only valid for very, very low values of t and that you want larger expansions to make them valid for larger t. Could it be that the radius of convergence is actually very small? In that case, adding more terms to the expansion is not going to help.

3 Likes

If you need a huge number of terms, Taylor expansions are likely to be the wrong tool. After people take first-year calculus and learn about Taylor expansions, many get the false impression that Taylor expansions are the tool to approximate complicated functions, when in practice this is rarely the case except for small domains where the Taylor series converges rapidly.

For example, if you look at special-functions implementations (erf, trigonometric, Bessel, etc.) they almost never use Taylor expansions except in small regions near roots of the function. It’s more common to do things like continued-fraction expansions, asymptotic series, minimax polynomial/rational fits, Chebyshev polynomials, and so forth.
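For instance, if the function is smooth on the interval of interest, a Chebyshev fit (here via ApproxFun.jl, with a made-up stand-in function) typically needs only a modest number of coefficients:

using ApproxFun

g(x) = exp(x) * cos(10x)      # stand-in for the "terribly complicated" function
p = Fun(g, 0..1)              # adaptive Chebyshev approximation on [0, 1]
ncoefficients(p)              # number of Chebyshev coefficients (a few dozen here)
abs(p(0.37) - g(0.37))        # pointwise error, near machine precision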

But it’s hard to say more without knowing more context. You say you have a “terribly complicated” function f(x) (of a single variable x \in \mathbb{R}, or \in \mathbb{C}, or …?), which is presumably analytic since you believe you have a Taylor series, and it sounds like you have a Julia function that can compute f(x) for any x (but maybe it’s too slow so you want to replace it with something simpler like a polynomial?). You are trying to approximate it in what domain x \in \mbox{?} and to what accuracy?

14 Likes

The issue though is that I believe Bornemann chooses the contour by optimising a condition number coming from the trapezium rule. I don’t know if you can do this with QuadGK.