Automatic Differentiation Slow (Slower than Finite-Differences)

Profiling with

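using Profile, ProfileView
using BenchmarkTools, ForwardDiff
Profile.clear()
# results, logit_GMM and b are the output buffer, objective and parameter vector from the original post
@profile @btime ForwardDiff.gradient!($results, $logit_GMM, $b) evals=100
ProfileView.view()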

shows that almost half the time is spent in garbage collection (unless this is somehow an artifact of the way I’m profiling), so I do think it’s worth preallocating more. Almost all of the remaining time is spent in the generic_matvecmul! call on line 18 (the generic, non-BLAS fallback, presumably hit because s holds Dual numbers during differentiation), i.e.

EΔm = (Δm_mat' * s) ./ size(ranges,1)

Since this is just a linear transformation of s (Δm_mat is constant), differentiating this step by hand is trivial and may be a good solution. For example, you could extract that part into its own function and specialize it for ForwardDiff.Dual inputs.
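
A rough sketch of what that specialization might look like. The function name mean_Δm and the argument n (standing in for size(ranges,1)) are made up for illustration, and I'm assuming Δm_mat is a plain Float64 matrix. The idea is to apply the constant matrix to the values and to the partials separately, so both products go through ordinary Float64 (BLAS) multiplication instead of the generic fallback. Whether this actually ends up faster in your case is something you'd have to benchmark.

```julia
using ForwardDiff
using ForwardDiff: Dual, Partials, value, partials

# Hypothetical helper: the linear step from line 18, pulled out into its own function.
# Generic fallback, identical to the original expression.
mean_Δm(Δm_mat::AbstractMatrix, s::AbstractVector, n::Integer) = (Δm_mat' * s) ./ n

# Specialization for Dual inputs: because the map is linear and Δm_mat is constant,
# the same matrix can be applied to the values and to each column of partials.
function mean_Δm(Δm_mat::AbstractMatrix{Float64},
                 s::AbstractVector{Dual{T,Float64,N}}, n::Integer) where {T,N}
    vals = value.(s)                                          # Vector{Float64}
    P = [partials(s[i], j) for i in eachindex(s), j in 1:N]   # length(s) × N matrix
    Evals = (Δm_mat' * vals) ./ n                             # Float64 matrix-vector product
    EP = (Δm_mat' * P) ./ n                                   # Float64 matrix-matrix product
    # Reassemble Dual numbers carrying the same tag T as the inputs.
    return [Dual{T}(Evals[i], Partials(ntuple(j -> EP[i, j], Val(N))))
            for i in eachindex(Evals)]
end

# Quick sanity check on made-up data: the Jacobian of this step should be Δm_mat' ./ n.
A = rand(6, 3)
s0 = rand(6)
n = 6
J = ForwardDiff.jacobian(x -> mean_Δm(A, x, n), s0)
```

With Float64 input the first method is used unchanged, so the function value itself is not affected; only calls made during differentiation (where s is a vector of Duals) dispatch to the second method.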