I noticed an unexpected performance regression when porting a package from v0.6.4 to v0.7 and reduced it to the example script below; the output on v0.6.4, v0.7 and v1.0 follows the script. ForwardDiff has become markedly faster (about 73 ns down to 21 ns), while the derivative generated symbolically via Calculus.differentiate has become significantly slower (about 18 ns up to 42 ns). The reason is clear: on v0.7 and v1.0 the generated expression contains many terms multiplied by zero that are not optimised away. Is there an optimisation that I should perform?
using ForwardDiff, Calculus, BenchmarkTools
e0, A, r0 = 1.234, 3.456, 1.012
fex = :( $e0 * (exp(-2*$A*(r/$r0-1.0)) - 2.0*exp(-$A*(r/$r0-1.0))) )
f = eval(:( r -> $fex ))
dfex = Calculus.differentiate(fex, :r)
df = eval(:( r -> $dfex ))
x = 1.0+rand()
print(" f: "); @btime ($f($x))
print(" f': "); @btime ($df($x))
print("FwDiff: "); @btime ForwardDiff.derivative($f, $x)
println("\nCalculus.differentiate expression:"); @show dfex
Output (j6 = v0.6.4, j7 = v0.7.0, j = v1.0.3):
Fuji-2:scratch ortner$ j6 -O3 calculus_test.jl
f: 18.502 ns (0 allocations: 0 bytes)
f': 17.873 ns (0 allocations: 0 bytes)
FwDiff: 73.476 ns (2 allocations: 32 bytes)
Calculus.differentiate expression:
dfex = :(1.234 * (-6.830039525691699 * exp(-6.912 * (r / 1.012 - 1.0)) - 2.0 * (-3.4150197628458496 * exp(-3.456 * (r / 1.012 - 1.0)))))
Fuji-2:scratch ortner$ j7 -O3 calculus_test.jl
f: 17.845 ns (0 allocations: 0 bytes)
f': 42.202 ns (0 allocations: 0 bytes)
FwDiff: 21.106 ns (0 allocations: 0 bytes)
Calculus.differentiate expression:
dfex = :(0 * (exp(-2 * 3.456 * (r / 1.012 - 1.0)) - 2.0 * exp(-3.456 * (r / 1.012 - 1.0))) + 1.234 * ((0 * 3.456 * (r / 1.012 - 1.0) + -2 * 0 * (r / 1.012 - 1.0) + -2 * 3.456 * (1 / 1.012)) * exp(-2 * 3.456 * (r / 1.012 - 1.0)) - (0 * exp(-3.456 * (r / 1.012 - 1.0)) + 2.0 * ((0 * (r / 1.012 - 1.0) + -3.456 * (1 / 1.012)) * exp(-3.456 * (r / 1.012 - 1.0))))))
Fuji-2:scratch ortner$ j -O3 calculus_test.jl
f: 17.797 ns (0 allocations: 0 bytes)
f': 42.823 ns (0 allocations: 0 bytes)
FwDiff: 21.431 ns (0 allocations: 0 bytes)
Calculus.differentiate expression:
dfex = :(0 * (exp(-2 * 3.456 * (r / 1.012 - 1.0)) - 2.0 * exp(-3.456 * (r / 1.012 - 1.0))) + 1.234 * ((0 * 3.456 * (r / 1.012 - 1.0) + -2 * 0 * (r / 1.012 - 1.0) + -2 * 3.456 * (1 / 1.012)) * exp(-2 * 3.456 * (r / 1.012 - 1.0)) - (0 * exp(-3.456 * (r / 1.012 - 1.0)) + 2.0 * ((0 * (r / 1.012 - 1.0) + -3.456 * (1 / 1.012)) * exp(-3.456 * (r / 1.012 - 1.0))))))
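One possible workaround is to post-process the expression before calling eval on it. The following is only a minimal sketch, not something Calculus.jl provides: `fold` is a hypothetical helper written for this illustration. It walks the expression bottom-up, evaluates calls whose arguments are all numeric literals, and drops `0 * x`, `x + 0` and `x - 0` terms. It assumes every sub-expression is finite (rewriting `0 * x` to `0` is wrong if `x` can be `Inf` or `NaN`), and folding constants may change the result in the last few bits.

```julia
# Hypothetical helper, not part of Calculus.jl: a small constant-folding pass.
isnum(x)     = x isa Number
iszerolit(x) = isnum(x) && x == 0
isonelit(x)  = isnum(x) && x == 1

fold(x) = x   # symbols and numeric literals are left unchanged

function fold(ex::Expr)
    ex.head == :call || return ex
    op   = ex.args[1]
    args = Any[fold(a) for a in ex.args[2:end]]   # simplify sub-expressions first

    # Arithmetic on literals only, e.g. -2 * 3.456 * (1 / 1.012): evaluate now.
    if op in (:+, :-, :*, :/) && !isempty(args) && all(isnum, args)
        return getfield(Base, op)(args...)
    end

    if op == :*
        any(iszerolit, args) && return 0          # 0 * x -> 0 (assumes x is finite)
        coeff = reduce(*, filter(isnum, args); init = 1)
        rest  = filter(!isnum, args)
        isonelit(coeff) || pushfirst!(rest, coeff) # one numeric coefficient, drop 1
        return length(rest) == 1 ? rest[1] : Expr(:call, :*, rest...)
    elseif op == :+
        rest = filter(!iszerolit, args)            # x + 0 -> x
        return length(rest) == 1 ? rest[1] : Expr(:call, :+, rest...)
    elseif op == :- && length(args) == 2
        iszerolit(args[2]) && return args[1]                   # x - 0 -> x
        iszerolit(args[1]) && return Expr(:call, :-, args[2])  # 0 - x -> -x
    end
    return Expr(:call, op, args...)
end

# One sub-term of the v0.7 / v1.0 expression shown above.
term = :(0 * exp(-3.456 * (r / 1.012 - 1.0)) +
         2.0 * ((0 * (r / 1.012 - 1.0) + -3.456 * (1 / 1.012)) *
                exp(-3.456 * (r / 1.012 - 1.0))))

folded = fold(term)
dterm  = eval(:( r -> $folded ))
@show folded
```

In the script above this would amount to `df = eval(:( r -> $(fold(dfex)) ))`. Whether that recovers the v0.6.4 timing would need to be benchmarked; the sketch only shows that the zero terms can be removed at the expression level.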