# Allocation-free higher-order derivatives of R^n->R

This question describes a particular AD application. I am looking for a solution, for someone to tell me why it is not a good idea, or for potential collaborators interested in the same problem, which comes up frequently whenever one needs multiple specific partial derivatives of a function (eg PDEs).

Consider a multivariate function with real arguments, eg f(x, y, z, ...), composed of a set of "elementary" functions (`+`, `*`, `cos`, etc), implemented in Julia.

I am interested in obtaining a user-specified set of derivatives via AD. This is how the (imaginary) API could work. Let's say `f(::Tuple{Real,Real})` is bivariate and takes x and y in a tuple.

```julia
# specify what I need, as a NamedTuple of variable indices
D = derivatives_I_want((fval = (), Dx = (1,), Dxy = (1, 2)))

# seeds with the relevant multivariate generalization of dual numbers
# D determines the order implicitly
lifted_xy = seed(D, (1.0, 2.0))

# arranges what I need in a NamedTuple
(; fval, Dx, Dxy) = extract(D, f(lifted_xy))
```
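For the bivariate case, the individual entries can of course already be computed today by nesting ForwardDiff.jl calls. A minimal sketch (the `f`, `x0`, `y0` below are made up for illustration); the drawback is that `f` is re-run once per requested derivative instead of sharing a single sweep:

```julia
using ForwardDiff

f(x, y) = x^2 * y^2   # stand-in for the real function

x0, y0 = 1.0, 2.0

fval = f(x0, y0)

# ∂f/∂x at (x0, y0)
Dx = ForwardDiff.derivative(x -> f(x, y0), x0)

# ∂²f/∂x∂y at (x0, y0), via nested dual numbers
Dxy = ForwardDiff.derivative(y -> ForwardDiff.derivative(x -> f(x, y), x0), y0)
```

The point of the proposed API is to get all of these from a single evaluation of `f` on a suitably seeded number type, without the repeated passes.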

The idea is a straightforward extension of ForwardDiff.jl, but the only package I could find that implements something like this is

where I would have to supply the wrapper API (`derivatives_I_want`, `seed`, `extract`) but that is straightforward. The only disadvantage is that the package uses `Vector`s and hash tables internally, so it is allocation-heavy. There was an attempt to make it static at

but it seems stalled.

Also note that while TaylorSeries.jl could be used to implement this kind of API, my proposal is a bit different because it starts from the specific derivatives the user wants, rather than computing all coefficients up to a given order.
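For comparison, a TaylorSeries.jl version might look roughly like this (a sketch from memory, so the exact calls may need adjusting): expand `f` around the point of interest and read the requested coefficients off the expansion, keeping in mind that the package computes every coefficient up to the chosen order, not just the ones asked for:

```julia
using TaylorSeries

# two variables, expansion up to total order 2
x, y = set_variables("x y", order = 2)

f(x, y) = x^2 * y^2   # stand-in for the real function

# expand f around the point (1.0, 2.0)
p = f(1.0 + x, 2.0 + y)

fval = getcoeff(p, [0, 0])          # f(1, 2)
Dx   = getcoeff(p, [1, 0])          # ∂f/∂x, coefficient times 1!
Dxy  = getcoeff(p, [1, 1])          # ∂²f/∂x∂y, coefficient times 1!·1!
```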

Thoughts appreciated. (@dpsanders, if you have the time I would especially value your opinion)


This sounds a lot like something FastDifferentiation.jl would be good at. To elaborate, it is a symbolic differentiation package that lets you query higher-order derivatives with respect to an arbitrary combination of variables. The initial compilation is slow, but if you perform the same computation repeatedly, you can turn it into a fast executable.


Yes, FastDifferentiation.jl can do exactly what you want. The generated executables are generally very fast compared to packages like ForwardDiff.jl, especially if your derivatives are sparse.

If you want to try it I'll be happy to help you get started.

Is this what you had in mind?

```julia
using FastDifferentiation

function multiple_derivatives()
    @variables x y

    f = x^2 * y^2

    derivs_I_want = [derivative(f, x), derivative(f, x, y)]

    println(derivs_I_want)

    exe = make_function(derivs_I_want, [x, y])

    println(exe([2.0, 3.0]))
end
export multiple_derivatives
```
```
julia> multiple_derivatives()
FastDifferentiation.Node[((y ^ 2) * (2 * x)), (4 * (y * x))]
[36.0, 24.0]
```

Hereβs a more complete example that shows a non-allocating version:

```julia
module FDExamples

using FastDifferentiation
using BenchmarkTools

function multiple_derivatives()
    @variables x y

    f = x^2 * y^2

    derivs_I_want = [derivative(f, x), derivative(f, x, y)]

    println(derivs_I_want)

    # allocating version
    exe = make_function(derivs_I_want, [x, y])

    println(exe([2.0, 3.0]))

    # non-allocating version
    result = similar(derivs_I_want, Float64)
    exe2! = make_function(derivs_I_want, [x, y]; in_place=true, init_with_zeros=false)

    exe2!(result, [2.0, 3.0])
    println(result)

    println("allocating version\n\n")
    input = [2.0, 3.0]
    display(@benchmark $exe($input))

    println("non-allocating version\n\n")
    exe2!(result, [2.0, 3.0])

    display(@benchmark $exe2!($result, $input))

    do_many(f, res, input) =
        for _ in 1:100
            f(res, input)
        end

    println("non-allocating 100 times\n\n")

    display(@benchmark $do_many($exe2!, $result, $input))
end
export multiple_derivatives

end # module FDExamples
```

Result of executing `multiple_derivatives`:

```
julia> multiple_derivatives()
FastDifferentiation.Node[((y ^ 2) * (2 * x)), (4 * (y * x))]
[36.0, 24.0]
[36.0, 24.0]
allocating version

BenchmarkTools.Trial: 10000 samples with 998 evaluations.
 Range (min … max):  19.038 ns …   3.497 μs  ┊ GC (min … max):  0.00% … 98.77%
 Time  (median):     21.142 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   27.759 ns ± 112.461 ns  ┊ GC (mean ± σ):  15.51% ±  3.81%

 [histogram omitted]
  19 ns         Histogram: log(frequency) by time      57.5 ns <

 Memory estimate: 80 bytes, allocs estimate: 1.
non-allocating version

BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
 Range (min … max):  1.700 ns … 10.300 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.800 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.882 ns ±  0.544 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 [histogram omitted]
  1.7 ns       Histogram: log(frequency) by time      3.2 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
non-allocating 100 times

BenchmarkTools.Trial: 10000 samples with 983 evaluations.
 Range (min … max):  54.425 ns … 349.644 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     58.291 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   59.780 ns ±  10.460 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

 [histogram omitted]
  54.4 ns       Histogram: log(frequency) by time      78.6 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.
```