Higher derivatives using ForwardDiff

Hi All,

If I have a function, say,

test(x) = x[1]^3*log(x[2])*sqrt(x[3])

then what is the best way to use ForwardDiff to compute higher-order partial derivatives of this function?

I tried the following
first_d(x) = ForwardDiff.gradient(test,x)
second_d(x) = ForwardDiff.hessian(test,x)
third_d(x) = ForwardDiff.jacobian(second_d,x)
p = [1.5, 2.0, 3.0]
first_d(p)
second_d(p)
third_d(p)

but this general approach seems very slow for functions containing a few more variables.

Can you quantify “very slow”? Compared to what? Just to give a number:

julia> using BenchmarkTools

julia> @btime third_d($p)
  2.704 ÎĽs (11 allocations: 4.97 KiB)

For higher derivatives you can try TaylorSeries.jl.

Hi Kristoffer,

so the problem that I am having with speed relates to the compliation time, actually. Once compiled my function only takes a few thousanths of a second to run, but it takes well-over a minute to compile. Unfortunately, the function is not one that gets run over and over again so the first time it runs is the time that matters. I guess I’m willing to sacrifice some run time in order to get a shorter compilation time.

Thanks. I had a look at TaylorSeries earlier today before posting; I’ll take another look.

Depending on the amount of runtime performance you’re willing to lose, you might want to try interpreting.

julia> using JuliaInterpreter

julia> @btime @interpret third_d($p);
  47.894 ms (149692 allocations: 5.71 MiB)

The first @interpret call in this place is still a bit slow, but not as bad as your compilation time for nested ForwardDiff. Here’s a fresh julia session:

julia> begin
       using JuliaInterpreter, ForwardDiff
       test(x) = x[1]^3*log(x[2])*sqrt(x[3])
       first_d(x) = ForwardDiff.gradient(test,x)
       second_d(x) = ForwardDiff.hessian(test,x)
       third_d(x) = ForwardDiff.jacobian(second_d,x)
       p = [1.5, 2.0, 3.0]
       end;

julia> @time @interpret third_d(p)
  4.670857 seconds (10.34 M allocations: 509.138 MiB, 3.84% gc time)

julia> @time @interpret third_d(p);
  0.049506 seconds (149.71 k allocations: 5.712 MiB)

The other nice thing about this approach is that you re-use the compilation machinery from function call to function call so you pay the compile price the first time you interpret something but no more later on.

You can play around with the chunk size parameter. A lower chunk size should lower compilation time at some cost to execution time.

http://www.juliadiff.org/ForwardDiff.jl/stable/user/advanced.html#Configuring-Chunk-Size-1

A chunk size of one seems to work really well:

julia> using ForwardDiff

julia> test(x) = x[1]^3*log(x[2])*sqrt(x[3])
test (generic function with 1 method)

julia> first_d(x) = ForwardDiff.gradient(test,x,ForwardDiff.GradientConfig(test,x,ForwardDiff.Chunk{1}()))
first_d (generic function with 1 method)

julia> second_d(x) = ForwardDiff.hessian(test,x,ForwardDiff.HessianConfig(test,x,ForwardDiff.Chunk{1}()))
second_d (generic function with 1 method)

julia> third_d(x) = ForwardDiff.jacobian(second_d,x,ForwardDiff.JacobianConfig(second_d,x,ForwardDiff.Chunk{1}()))
third_d (generic function with 1 method)

julia> p = [1.5, 2.0, 3.0]
3-element Array{Float64,1}:
 1.5
 2.0
 3.0

julia> @time third_d(p)
  1.531497 seconds (1.70 M allocations: 85.207 MiB, 7.16% gc time)
9Ă—3 Array{Float64,2}:
  7.2034     7.79423     1.80085  
  7.79423   -2.92284     0.974279 
  1.80085    0.974279   -0.225106 
  7.79423   -2.92284     0.974279 
 -2.92284    1.46142    -0.24357  
  0.974279  -0.24357    -0.0811899
  1.80085    0.974279   -0.225106 
  0.974279  -0.24357    -0.0811899
 -0.225106  -0.0811899   0.0562765

julia> @time third_d(p)
  0.000016 seconds (37 allocations: 4.500 KiB)
9Ă—3 Array{Float64,2}:
  7.2034     7.79423     1.80085  
  7.79423   -2.92284     0.974279 
  1.80085    0.974279   -0.225106 
  7.79423   -2.92284     0.974279 
 -2.92284    1.46142    -0.24357  
  0.974279  -0.24357    -0.0811899
  1.80085    0.974279   -0.225106 
  0.974279  -0.24357    -0.0811899
 -0.225106  -0.0811899   0.0562765

julia> using BenchmarkTools

julia> @btime third_d($p)
  4.362 ÎĽs (33 allocations: 4.34 KiB)
9Ă—3 Array{Float64,2}:
  7.2034     7.79423     1.80085  
  7.79423   -2.92284     0.974279 
  1.80085    0.974279   -0.225106 
  7.79423   -2.92284     0.974279 
 -2.92284    1.46142    -0.24357  
  0.974279  -0.24357    -0.0811899
  1.80085    0.974279   -0.225106 
  0.974279  -0.24357    -0.0811899
 -0.225106  -0.0811899   0.0562765

julia> versioninfo()
Julia Version 1.4.0-DEV.106
Commit 6a20ad7eaa* (2019-09-08 06:58 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.0 (ORCJIT, haswell)
Environment:
  JULIA_NUM_THREADS = 4

Thanks for the suggestion (and you too Elrod). Making the chuck size equal to 1 does lower the compliation time significantly.