# How to optimize a Jacobian vs ForwardDiff.jl

Hello, I manually calculated the Jacobian for this system:

``````julia
using StaticArrays  # for the SA constructor used below

function TM(u, p, t)
U(y) = p[8] + p[9] / ( 1.0 + exp( -50.0 * (y - p[7]) ) )
σ(x) = 1.0 / ( 1.0 + exp( -20.0 * (x-p[6]) ) )
du1 = (-u[1] + p[1] * log( 1.0 + exp( (p[5] * U(u[3]) * u[2] * u[1] + p[11]  ) / (p[1]) ) ) ) / p[2]
du2 = (1.0 - u[2])/p[3] - U(u[3])*u[2]*u[1]
du3 = (-u[3])/p[4] + p[10] * σ(u[2])
return SA[du1, du2, du3]
end
``````

The Jacobian:

``````julia
using StaticArrays  # for SMatrix

U(y, p) = p[8] + p[9] / ( 1.0 + exp( -50.0 * (y - p[7]) ) )
σ(x, p) = 1.0 / ( 1.0 + exp( -20.0 * (x - p[6]) ) )
U_y(y, p) = ( 50.0 * p[9] * exp(-50.0 * (y - p[7])) ) / (1.0 + exp( -50.0*(y - p[7]) ) )^2
g(E, x, y, p) = exp((p[5]  * U(y, p) * x * E + p[11]) / p[1])
σ_der(x, p) = exp( (-20.0) * (x - p[6]) )
function jacob_TM_(u, p, t)

E_E = ( -1.0 + (p[5] * U(u[3], p) * u[2] * g(u[1], u[2], u[3], p)) / (1.0 + g(u[1], u[2], u[3], p)) ) / p[2]  # `J` was undefined here; p[5] plays that role (cf. E_x below)
E_x = (p[5] * U(u[3], p) * u[1] * g(u[1], u[2], u[3], p)) / (p[2] * (1.0 + g(u[1], u[2], u[3], p) ))
E_y = ( p[5]  * U_y(u[3], p) * u[2] * u[1] * g(u[1], u[2], u[3], p) ) / (p[2] * (1.0 + g(u[1], u[2], u[3], p)) )

x_E = -U(u[3], p)*u[2]
x_x = -1.0 / p[3] - U(u[3], p)*u[1]
x_y = (-U_y(u[3], p)) * u[2] * u[1]

y_x = 20.0 * p[10] * σ_der(u[2], p) / (1.0 + σ_der(u[2], p))^2
y_y = -1.0/p[4]

SMatrix{3,3}(E_E, x_E, 0.0,
E_x, x_x, y_x,
E_y, x_y, y_y)
end
``````
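Before benchmarking, it is worth sanity-checking the hand-derived entries against central finite differences. Below is a self-contained sketch in plain Julia (no StaticArrays, so it runs stand-alone); the values in `u` and `p` are made-up placeholders, not the actual model parameters:

``````julia
# Plain-Julia copy of the right-hand side, returning a Vector.
function tm_rhs(u, p)
    U = p[8] + p[9] / (1.0 + exp(-50.0 * (u[3] - p[7])))
    s = 1.0 / (1.0 + exp(-20.0 * (u[2] - p[6])))
    du1 = (-u[1] + p[1] * log(1.0 + exp((p[5] * U * u[2] * u[1] + p[11]) / p[1]))) / p[2]
    du2 = (1.0 - u[2]) / p[3] - U * u[2] * u[1]
    du3 = -u[3] / p[4] + p[10] * s
    return [du1, du2, du3]
end

# Central-difference Jacobian, column by column.
function fd_jacobian(f, u, p; h = 1e-6)
    J = zeros(3, 3)
    for j in 1:3
        up = copy(u); um = copy(u)
        up[j] += h
        um[j] -= h
        J[:, j] = (f(up, p) .- f(um, p)) ./ (2h)
    end
    return J
end

u = [0.5, 0.6, 0.3]                    # placeholder state
p = [1.5, 0.013, 0.013, 0.013, 3.07,   # placeholder parameters
     0.3, 0.05, 0.3, 0.305, 0.5, -1.6]
J_fd = fd_jacobian(tm_rhs, u, p)

# Two entries that are easy to verify against the analytic formulas:
Uval = p[8] + p[9] / (1.0 + exp(-50.0 * (u[3] - p[7])))
J_fd[2, 1]  # should match x_E = -Uval * u[2]
J_fd[3, 3]  # should match y_y = -1.0 / p[4]
``````

Since `du2` is linear in `u[1]` and `du3` is linear in `u[3]`, these two entries should agree with the finite differences essentially to machine precision.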

I compared my Jacobian and the ForwardDiff Jacobian:

``````
BenchmarkTools.Trial: 10000 samples with 600 evaluations.
Range (min … max):  195.833 ns …  40.754 μs  ┊ GC (min … max): 0.00% … 99.38%
Time  (median):     215.500 ns               ┊ GC (median):    0.00%
Time  (mean ± σ):   222.815 ns ± 567.948 ns  ┊ GC (mean ± σ):  3.60% ±  1.41%

▄       ▁▂   ▆█
▂▂▃▂▄▆▆█▆▂▂▁▁▂▃██▃▄▆██▇▅▃▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
196 ns           Histogram: frequency by time          262 ns <

Memory estimate: 80 bytes, allocs estimate: 1.
``````
``````
BenchmarkTools.Trial: 10000 samples with 254 evaluations.
Range (min … max):  283.858 ns … 135.250 μs  ┊ GC (min … max): 0.00% … 99.67%
Time  (median):     344.094 ns               ┊ GC (median):    0.00%
Time  (mean ± σ):   378.551 ns ±   1.993 μs  ┊ GC (mean ± σ):  9.03% ±  1.72%

▆▃▅       ▄█▆▃▅▁
███▄▃▂▂▁▂███████▇▄▃▂▂▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
284 ns           Histogram: frequency by time          600 ns <

Memory estimate: 336 bytes, allocs estimate: 4.
``````

Then I computed the Lyapunov spectrum with `DynamicalSystems.jl`, and the spectrum with the manual Jacobian is slower:

``````
BenchmarkTools.Trial: 18 samples with 1 evaluation.
Range (min … max):  277.929 ms … 287.031 ms  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     281.196 ms               ┊ GC (median):    0.00%
Time  (mean ± σ):   281.640 ms ±   3.042 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

█                             ▃
▇▇▁▁▇▇▁█▁▁▇▁▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▁▁▇█▇▁▁▁▁▁▁▁▇▇▁▁▁▁▁▁▁▁▇▁▁▁▇ ▁
278 ms           Histogram: frequency by time          287 ms <

Memory estimate: 3.28 KiB, allocs estimate: 47.
``````
``````
BenchmarkTools.Trial: 26 samples with 1 evaluation.
Range (min … max):  195.002 ms … 201.918 ms  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     195.455 ms               ┊ GC (median):    0.00%
Time  (mean ± σ):   196.571 ms ±   1.996 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

█▂ ▂
████▅▁▅▅▁▁▅▁▁▁▁▁▅▁▅▁▁▁▁▁▁▁▅▁▁▁▅▁▅█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▁▁▁▅ ▁
195 ms           Histogram: frequency by time          202 ms <

Memory estimate: 3.28 KiB, allocs estimate: 47.
``````

Why is the manual Jacobian slower?


ForwardDiff is able to do some SIMD that can help in a lot of cases. It also probably hoisted a few things compared to your version; for example, it computes `U(u[3], p)` only once instead of many times.
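The point about computing `U(u[3], p)` only once can be shown with a toy example: `f_naive` re-evaluates the `exp`-containing expression in every term, while `f_hoisted` stores the shared subexpression, so both return the same value but the second does less work per call. The function names and parameter values here are invented for illustration:

``````julia
# Toy illustration of hoisting a common subexpression: both functions
# compute U^2 + U, but f_naive evaluates the exp-containing U twice.
f_naive(y, p) = (p[8] + p[9] / (1.0 + exp(-50.0 * (y - p[7]))))^2 +
                (p[8] + p[9] / (1.0 + exp(-50.0 * (y - p[7]))))

function f_hoisted(y, p)
    U = p[8] + p[9] / (1.0 + exp(-50.0 * (y - p[7])))  # evaluated once
    return U^2 + U
end

p = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.05, 0.3, 0.305)  # placeholder; only p[7:9] used
f_naive(0.3, p), f_hoisted(0.3, p)  # identical results
``````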


Thank you!
I managed to optimize the time. Is it still possible to improve the result?
Jacobian:

``````
BenchmarkTools.Trial: 10000 samples with 973 evaluations.
Range (min … max):  75.026 ns …  24.038 μs  ┊ GC (min … max): 0.00% … 99.52%
Time  (median):     93.320 ns               ┊ GC (median):    0.00%
Time  (mean ± σ):   98.275 ns ± 410.607 ns  ┊ GC (mean ± σ):  7.22% ±  1.72%

▅▂▇▅▄▃▂▁▁         ▅▇▅██▅▃▂▂▂▁▁▁▁▁▁▁▁                         ▂
██████████▇▆▆▅▅▆▄██████████████████████▇▇▇▇▆▆▆▆▆▇███▇▇▆▆▆▆▆▆ █
75 ns         Histogram: log(frequency) by time       127 ns <

Memory estimate: 80 bytes, allocs estimate: 1.
``````
``````
BenchmarkTools.Trial: 10000 samples with 267 evaluations.
Range (min … max):  286.142 ns … 98.949 μs  ┊ GC (min … max):  0.00% … 99.54%
Time  (median):     349.813 ns              ┊ GC (median):     0.00%
Time  (mean ± σ):   378.079 ns ±  1.933 μs  ┊ GC (mean ± σ):  10.21% ±  1.99%

▆▇▆▅▄▃▂▁     ▂▃▅▃▅▇█▇▅▄▃                       ▁ ▁           ▂
█████████▇█▇▇████████████▇▇▇▅▆▆▆▅▆▆▇▆▇▇▅▅▆▅▄▇▆█████▇▇▇▆▅▅▅▄▆ █
286 ns        Histogram: log(frequency) by time       494 ns <

Memory estimate: 336 bytes, allocs estimate: 4.
``````

lyapunovspectrum:

``````
BenchmarkTools.Trial: 32 samples with 1 evaluation.
Range (min … max):  157.645 ms … 163.120 ms  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     159.589 ms               ┊ GC (median):    0.00%
Time  (mean ± σ):   159.418 ms ±   1.001 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

▁        ▁  ▄       ██▄▁   ▁
▆█▁▆▁▁▁▁▁▁█▁▁█▆▆▁▆▁▁▆████▆▁▆█▁▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆ ▁
158 ms           Histogram: frequency by time          163 ms <

Memory estimate: 3.28 KiB, allocs estimate: 47.
``````
``````
BenchmarkTools.Trial: 26 samples with 1 evaluation.
Range (min … max):  194.245 ms … 205.892 ms  ┊ GC (min … max): 0.00% … 0.00%
Time  (median):     197.141 ms               ┊ GC (median):    0.00%
Time  (mean ± σ):   197.773 ms ±   2.971 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

▁▁           ▁   █
▆▁██▆▆▁▆▁▆▁▁▆▆▆█▁▆▆█▁▁▁▁▁▁▁▁▁▁▆▁▆▁▁▁▁▁▁▁▁▆▁▆▁▁▁▆▁▁▁▁▁▁▁▁▁▁▁▁▆ ▁
194 ms           Histogram: frequency by time          206 ms <

Memory estimate: 3.28 KiB, allocs estimate: 47.
``````

Code:

``````julia
@inbounds function jacob_TM_(u, p, t)

U(y, p, exp50) = p[8] + p[9] / ( 1.0 + exp50 )
U_y(y, p, exp50) = (50.0 * p[9] * exp50) / (1.0 + exp50)^2
g(E, x, y, p, U_) = exp((p[5]  * U_ * x * E + p[11]) / p[1])
σ_der(x, p) = exp( (-20.0) * (x - p[6]) )

exp50 = exp(-50.0 * (u[3] - p[7]))

U_ = U(u[3], p, exp50)
Uy = U_y(u[3], p, exp50)
g_ = g(u[1], u[2], u[3], p, U_)
σ_deri = σ_der(u[2], p)

g_plus = 1.0 + g_
g_mult = g_ * U_
g_plus_mult = p[2] * (g_plus)
u1p5 = p[5] * u[1]
Uyu2 = Uy * u[2]

E_E = (-1.0 + (p[5] * u[2] * g_mult) / g_plus) / p[2]  # `J` was undefined here; use p[5]
E_x = (u1p5 * g_mult) / (g_plus_mult)
E_y = (u1p5 * Uyu2 * g_) / (g_plus_mult)

x_E = -U_ * u[2]
x_x = -1.0 / p[3] - U_ * u[1]
x_y = -Uyu2 * u[1]

y_x = 20.0 * p[10] * σ_deri / (1.0 + σ_deri)^2
y_y = -1.0/p[4]

SMatrix{3,3}(E_E, x_E, 0.0,
E_x, x_x, y_x,
E_y, x_y, y_y)
end
``````
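After a refactor like this, it is also worth confirming that the hoisted entries still equal the original formulas at the same point. A stand-alone sketch for the `E_E` entry, with placeholder `u` and `p` (and `p[5]` substituted for the otherwise-undefined `J`):

``````julia
u = [0.5, 0.6, 0.3]                    # placeholder state
p = [1.5, 0.013, 0.013, 0.013, 3.07,   # placeholder parameters
     0.3, 0.05, 0.3, 0.305, 0.5, -1.6]

# Original formulation: helpers evaluated inline, several times.
U(y)       = p[8] + p[9] / (1.0 + exp(-50.0 * (y - p[7])))
g(E, x, y) = exp((p[5] * U(y) * x * E + p[11]) / p[1])
E_E_orig = (-1.0 + p[5] * U(u[3]) * u[2] * g(u[1], u[2], u[3]) /
            (1.0 + g(u[1], u[2], u[3]))) / p[2]

# Optimized formulation: shared subexpressions hoisted once.
exp50 = exp(-50.0 * (u[3] - p[7]))
U_    = p[8] + p[9] / (1.0 + exp50)
g_    = exp((p[5] * U_ * u[2] * u[1] + p[11]) / p[1])
E_E_opt = (-1.0 + (p[5] * u[2] * g_ * U_) / (1.0 + g_)) / p[2]
``````

`E_E_orig` and `E_E_opt` should agree up to floating-point reassociation, which is a quick way to catch transcription slips when rearranging terms by hand.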