Hi,
Thanks for the comments. Unfortunately, the real case has O(500) parameters, so I suspect that forward-mode AD will also be significantly slower (it will require O(500) passes). Moreover, ForwardDiff.jl also has the problem that it is difficult to evaluate derivatives on custom types…
I am new to Julia for these things, and I have to say that I am surprised, given the preference for Zygote.jl in ML applications: this kind of loss function (with a loop over the data) is basically the cornerstone of any least-squares or Bayesian inference problem. Do you happen to know the approach taken by Flux.jl? Does it just live with the allocations and the slower speed?
Thanks!