It is definitely a good idea to move dfexpr out of the function body, to help with compile times. Better still, when which derivatives I’m taking is known at compile time, the generated-function version doesn’t have to do the “if” checks at all.
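For reference, a minimal sketch of the structure I mean, with a placeholder body standing in for the real derivative expressions (gamma_moment_sketch, dfexpr, and its body here are purely illustrative, not the actual definitions):

# Build the expressions once, at top level, rather than inside the
# generated function body:
dfexpr(N) = :((a + $N) / b)        # placeholder; the real version builds the N-th derivative
const dfexprs = ntuple(dfexpr, 5)  # all five expressions, computed up front

# Each Val{N} method gets its expression spliced in at compile time,
# so the compiled code contains no runtime "if" checks:
@generated function gamma_moment_sketch(a, b, ::Val{N}) where {N}
    return dfexprs[N]
end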
A comparison on a different computer; the _nofm version means without @fastmath:
julia> @benchmark gamma_moment(2,2,Val(2))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 4.175 ns (0.00% GC)
median time: 4.537 ns (0.00% GC)
mean time: 4.686 ns (0.00% GC)
maximum time: 21.161 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
julia> @benchmark gamma_moment(2,2,2)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 23.837 ns (0.00% GC)
median time: 23.851 ns (0.00% GC)
mean time: 24.677 ns (0.00% GC)
maximum time: 44.803 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 996
julia> @benchmark gamma_moment_nofm(2,2,Val(2))
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 15.349 ns (0.00% GC)
median time: 15.489 ns (0.00% GC)
mean time: 15.960 ns (0.00% GC)
maximum time: 37.048 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 998
It’s also easy to wrap the Val version to support derivative orders chosen at runtime (up to a fixed maximum):
julia> function gamma_moment(a, b, N::Int)
           Base.Cartesian.@nif 5 i -> (i == N) i -> gamma_moment(a, b, Val{i}()) i -> error(string(N) * "-th not supported")
       end
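(Note that error already throws, so the throw wrapper isn’t needed.) For reference, the @nif call above unrolls into a chain of branches roughly like the following, which also means N == 5 lands in the error branch; bump the 5 if higher orders are needed:

if 1 == N
    gamma_moment(a, b, Val{1}())
elseif 2 == N
    gamma_moment(a, b, Val{2}())
elseif 3 == N
    gamma_moment(a, b, Val{3}())
elseif 4 == N
    gamma_moment(a, b, Val{4}())
else
    error(string(N) * "-th not supported")
end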
julia> @benchmark gamma_moment(2,2,2)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 14.129 ns (0.00% GC)
median time: 14.131 ns (0.00% GC)
mean time: 14.722 ns (0.00% GC)
maximum time: 30.733 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 998
In my application, which derivatives are being taken is known at compile time, so I can stick with the Val version.
The problem with wanting to use it for symbolic differentiation is that it wouldn’t support things like for loops or repeated assignments (unless I’m mistaken):
function example(a, b)
    target = 3*a*b
    target += 2*(1-a)*(1-b)
    target
end
so the thing to do would be to write a macro that translates the above function body into the single expression :(3*a*b + 2*(1-a)*(1-b)), taking care to correctly support things like
function example(a, b)
    z = 3
    target = z*a*b
    z -= 1
    target += z*(1-a)*(1-b)
    target
end
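A naive sketch of the substitution idea (flatten_body and substitute are hypothetical helpers; this only handles straight-line =, +=, and -= assignments, and loops and branches are where it gets hard):

# Replace every symbol that has a known symbolic value:
substitute(ex, env) = get(env, ex, ex)   # symbols and literals
substitute(ex::Expr, env) = Expr(ex.head, map(a -> substitute(a, env), ex.args)...)

# Walk the statements in order, tracking each variable's current
# symbolic value, so reassignments like z -= 1 are respected:
function flatten_body(stmts)
    env = Dict{Symbol,Any}()
    result = nothing
    for ex in stmts
        ex isa LineNumberNode && continue
        if ex isa Expr && ex.head == :(=)
            env[ex.args[1]] = substitute(ex.args[2], env)
        elseif ex isa Expr && ex.head == Symbol("+=")
            env[ex.args[1]] = :($(env[ex.args[1]]) + $(substitute(ex.args[2], env)))
        elseif ex isa Expr && ex.head == Symbol("-=")
            env[ex.args[1]] = :($(env[ex.args[1]]) - $(substitute(ex.args[2], env)))
        else
            result = substitute(ex, env)   # e.g. the trailing target
        end
    end
    result
end

On the second example this yields roughly :(3*a*b + (3 - 1)*(1-a)*(1-b)), which is correct, though a real version would also want constant folding; that is another place a CAS would help.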
One of my motivations here is realizing how slow the AD packages I’ve tried are at finding Hessians with respect to x of functions of the form:
f(x,A) = x' * A * x / 2
where the answer is simply (A + A')/2, which is just A when A is symmetric. Instead of being essentially a no-op, they’re pretty slow. A symbolic approach should be able to simplify things and get there quite easily. I can’t help but think packages like XGrad would be easier to write if they could use a CAS. Perhaps it could also be combined with DataFlow (as a better @fastmath) to eliminate redundant code when computing all the derivatives 0,…,k.
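For concreteness, here is the kind of check I mean, using ForwardDiff as one example (a hypothetical snippet, not a claim about any particular package’s internals):

using ForwardDiff, BenchmarkTools

f(x, A) = x' * A * x / 2

A = [2.0 1.0; 1.0 3.0]   # symmetric, so the Hessian is exactly A
x = ones(2)

ForwardDiff.hessian(z -> f(z, A), x)   # returns A, computed the hard way via nested duals
@benchmark ForwardDiff.hessian(z -> f(z, $A), $x)

A symbolic pass that recognizes the quadratic form could instead emit (A + A')/2 directly.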