Higher derivatives using ForwardDiff

Hi All,

If I have a function, say,

test(x) = x[1]^3*log(x[2])*sqrt(x[3])

then what is the best way to use ForwardDiff to compute higher-order partial derivatives of this function?

I tried the following
first_d(x) = ForwardDiff.gradient(test,x)
second_d(x) = ForwardDiff.hessian(test,x)
third_d(x) = ForwardDiff.jacobian(second_d,x)
p = [1.5, 2.0, 3.0]
first_d(p)
second_d(p)
third_d(p)

but this general approach seems very slow for functions containing a few more variables.

1 Like

Can you quantify “very slow”? Compared to what? Just to give a number:

julia> using BenchmarkTools

julia> @btime third_d($p)
  2.704 μs (11 allocations: 4.97 KiB)
3 Likes

For higher derivatives you can try TaylorSeries.jl.

1 Like

Hi Kristoffer,

so the problem that I am having with speed relates to the compliation time, actually. Once compiled my function only takes a few thousanths of a second to run, but it takes well-over a minute to compile. Unfortunately, the function is not one that gets run over and over again so the first time it runs is the time that matters. I guess I’m willing to sacrifice some run time in order to get a shorter compilation time.

Thanks. I had a look at TaylorSeries earlier today before posting; I’ll take another look.

Depending on the amount of runtime performance you’re willing to lose, you might want to try interpreting.

julia> using JuliaInterpreter

julia> @btime @interpret third_d($p);
  47.894 ms (149692 allocations: 5.71 MiB)

The first @interpret call in this place is still a bit slow, but not as bad as your compilation time for nested ForwardDiff. Here’s a fresh julia session:

julia> begin
       using JuliaInterpreter, ForwardDiff
       test(x) = x[1]^3*log(x[2])*sqrt(x[3])
       first_d(x) = ForwardDiff.gradient(test,x)
       second_d(x) = ForwardDiff.hessian(test,x)
       third_d(x) = ForwardDiff.jacobian(second_d,x)
       p = [1.5, 2.0, 3.0]
       end;

julia> @time @interpret third_d(p)
  4.670857 seconds (10.34 M allocations: 509.138 MiB, 3.84% gc time)

julia> @time @interpret third_d(p);
  0.049506 seconds (149.71 k allocations: 5.712 MiB)

The other nice thing about this approach is that you re-use the compilation machinery from function call to function call so you pay the compile price the first time you interpret something but no more later on.

1 Like

You can play around with the chunk size parameter. A lower chunk size should lower compilation time at some cost to execution time.

http://www.juliadiff.org/ForwardDiff.jl/stable/user/advanced.html#Configuring-Chunk-Size-1

2 Likes

A chunk size of one seems to work really well:

julia> using ForwardDiff

julia> test(x) = x[1]^3*log(x[2])*sqrt(x[3])
test (generic function with 1 method)

julia> first_d(x) = ForwardDiff.gradient(test,x,ForwardDiff.GradientConfig(test,x,ForwardDiff.Chunk{1}()))
first_d (generic function with 1 method)

julia> second_d(x) = ForwardDiff.hessian(test,x,ForwardDiff.HessianConfig(test,x,ForwardDiff.Chunk{1}()))
second_d (generic function with 1 method)

julia> third_d(x) = ForwardDiff.jacobian(second_d,x,ForwardDiff.JacobianConfig(second_d,x,ForwardDiff.Chunk{1}()))
third_d (generic function with 1 method)

julia> p = [1.5, 2.0, 3.0]
3-element Array{Float64,1}:
 1.5
 2.0
 3.0

julia> @time third_d(p)
  1.531497 seconds (1.70 M allocations: 85.207 MiB, 7.16% gc time)
9×3 Array{Float64,2}:
  7.2034     7.79423     1.80085  
  7.79423   -2.92284     0.974279 
  1.80085    0.974279   -0.225106 
  7.79423   -2.92284     0.974279 
 -2.92284    1.46142    -0.24357  
  0.974279  -0.24357    -0.0811899
  1.80085    0.974279   -0.225106 
  0.974279  -0.24357    -0.0811899
 -0.225106  -0.0811899   0.0562765

julia> @time third_d(p)
  0.000016 seconds (37 allocations: 4.500 KiB)
9×3 Array{Float64,2}:
  7.2034     7.79423     1.80085  
  7.79423   -2.92284     0.974279 
  1.80085    0.974279   -0.225106 
  7.79423   -2.92284     0.974279 
 -2.92284    1.46142    -0.24357  
  0.974279  -0.24357    -0.0811899
  1.80085    0.974279   -0.225106 
  0.974279  -0.24357    -0.0811899
 -0.225106  -0.0811899   0.0562765

julia> using BenchmarkTools

julia> @btime third_d($p)
  4.362 μs (33 allocations: 4.34 KiB)
9×3 Array{Float64,2}:
  7.2034     7.79423     1.80085  
  7.79423   -2.92284     0.974279 
  1.80085    0.974279   -0.225106 
  7.79423   -2.92284     0.974279 
 -2.92284    1.46142    -0.24357  
  0.974279  -0.24357    -0.0811899
  1.80085    0.974279   -0.225106 
  0.974279  -0.24357    -0.0811899
 -0.225106  -0.0811899   0.0562765

julia> versioninfo()
Julia Version 1.4.0-DEV.106
Commit 6a20ad7eaa* (2019-09-08 06:58 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.0 (ORCJIT, haswell)
Environment:
  JULIA_NUM_THREADS = 4
1 Like

Thanks for the suggestion (and you too Elrod). Making the chuck size equal to 1 does lower the compliation time significantly.