ForwardDiff 3rd (or even higher) order derivatives

differentiation

#1

I am trying to calculate higher derivatives with ForwardDiff, but I get DomainError for the third derivative. I definitely need at least order 7, but up to 13 would be great. Does anybody have an idea, if a) this is supposed to work b) if yes what am I doing wrong. Thanks in advance…

The code is
g0(z) = [z[2]^2*z[1]^3]
g1(z) = ForwardDiff.jacobian(g0,z)
g2(z) = ForwardDiff.jacobian(g1,z)
g3(z) = ForwardDiff.jacobian(g2,z)

x=[ 2 2 ]
g1(x)
1×2 Array{Int64,2}:
48 32

g2(x)
2×2 Array{Int64,2}:
48 48
48 16

g3(x)
ERROR: DomainError:
Stacktrace:
[1] ^ at /home/rs1909/.julia/v0.6/ForwardDiff/src/dual.jl:395 [inlined] (repeats 3 times)
etc…


#2

Use x = [2., 2.] instead (giving you a vector of floats instead of integers). The error you get says:

Cannot raise an integer x to a negative power -n.
Make x a float by adding a zero decimal (e.g. 2.0^-n instead of 2^-n), or write 1/x^n, float(x)^-n, or (x//1)^-n.

Order 13 derivatives will create very complicated nested types, it is possible compilation time / inference time will become an issue but perhaps it will work well.


#3

I suggest you try TaylorSeries.jl if you need high derivatives.


#4

I wondered about that, but it seems to me that I would need to expand about a constant value. I am trying to implement the Faa Di Bruno formula (high order chain rule) and I need to dynamically change where the derivative is calculated.


#5

Thank, this is great help. I should have realised the integer type problem. It works now.


#6

Can you specify a bit more precisely exactly what you need? Here’s a starter example with TaylorSeries.jl (which is a registered package, so just Pkg.add("TaylorSeries") to install):

julia> using TaylorSeries

julia> f(x, y) = cos(x*y) + sin(x - y^2);

julia> xx, yy = set_variables("x y", order=15);

julia> f(xx, yy)
 1.0 + 1.0 x² - 1.0 y² - 0.5 x² y² - 0.16666666666666666 x⁶ + 0.5 x⁴ y² - 0.5 x² y⁴ + 0.16666666666666666 y⁶ + 0.041666666666666664 x⁴ y⁴ + 0.008333333333333333 x¹⁰ - 0.041666666666666664 x⁸ y² + 0.08333333333333333 x⁶ y⁴ - 0.08333333333333333 x⁴ y⁶ + 0.041666666666666664 x² y⁸ - 0.008333333333333333 y¹⁰ - 0.0013888888888888887 x⁶ y⁶ - 0.00019841269841269839 x¹⁴ + 0.0013888888888888885 x¹² y² - 0.004166666666666666 x¹⁰ y⁴ + 0.006944444444444443 x⁸ y⁶ - 0.006944444444444443 x⁶ y⁸ + 0.004166666666666666 x⁴ y¹⁰ - 0.0013888888888888885 x² y¹² + 0.00019841269841269839 y¹⁴ + 𝒪(‖x‖¹⁶)

There are various derivative methods available to extract derivatives, or you can just access the coefficients of this multivariate polynomial.

If you need to expand e.g. around (-3, 4), you can do

f(3+xx, -4+yy)

Here, xx and yy now represent small perturbations around -3 and 4, respectively.


#7

Thanks for the interest, the full problem is the following. I have a differential equation defined in a rather strange way. Let n > 2, and denote the total differentiation with respect to ‘t’ by capital D. My differential equation is

D^(n+1) u(t) = f(t,u(t)) + D^n { g(t,u(t)) },

where u is a vector and t is a scalar, f and g are also vector valued, the same dimension as u.

My biggest problem is that I need to calculate the right-hand side, which involves calculating the n-th derivative of g(t,u(t)). For this I need to use the generalised chain rule. By rewriting the equation into first order form, the values of D^k u(t), 0 <= k <= n become state variables, so in essence I need to calculate various partial derivatives of g(t,u) and combine them with D^k u(t).

The first issue is that I really need the full tensor form of the derivative, because they are used as multi-linear forms, multiplied by various D^k u(t) values. As far as I can tell Taylor expansion assumes that all arguments of the multi linear forms are the same (hence you get a polynomial)

The second issue is that I could not see an example how to parameterise the function by those shifted values, such that the derivative functions do not need to be recompiled every time a different shift is used (which is every call).

I can now see that ForwardDiff will work (in theory) even if it is very slow to compile. I was also thinking of just using computer algebra to do the derivation, but the function ‘g’ is not a nice algebraic expression.

As you can see this is highly non-standard and at the moment I think Julia is my best bet.


#8

Thanks for the explanation. Is ForwardDiff set up to return the nth derivative as a tensor, though?