Thanks. I think I’m too much of an AD novice to understand how a mixed-mode implementation works at this point, or what its benefits are. I’ve moved on to trying to get my original ODE problem to work more or less as originally implemented, except without hardcoding a Float64 array. I have run into a different problem, with a slightly different simplified example:
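    using Flux.Tracker: param, derivative  # Tracker as bundled with Flux at the time

    function f(c)
        β = 0.8  # 0.7 works
        F = 1.0 .- β*[0,1,2]              # forcing at each step
        x = zeros(eltype(param(0.0)), 3)  # tracked element type rather than a hardcoded Float64
        for i in 1:2
            x[i+1] = F[i] - c*x[i]        # forcing minus decay; x[3] = F[2] - c = -0.05 here
        end
        return sum(x.^2)
    end

    derivative(f, param(0.25))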
gives:
DomainError with -0.050000000000000044:
log will only return a complex result if called with a complex argument. Try log(Complex(x)).
Stacktrace:
[1] throw_complex_domainerror(::Symbol, ::Float64) at ./math.jl:31
[2] log(::Float64) at ./special/log.jl:285
[3] _forward at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:55 [inlined]
[4] #track#1 at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/Tracker.jl:50 [inlined]
[5] track at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/Tracker.jl:50 [inlined]
[6] log at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:57 [inlined]
[7] #218 at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/scalar.jl:66 [inlined]
[8] back_(::Flux.Tracker.Grads, ::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##218#219")){Flux.Tracker.TrackedReal{Float64},Int64},Tuple{Flux.Tracker.Tracked{Float64},Nothing}}, ::Int64) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:103
[9] back(::Flux.Tracker.Grads, ::Flux.Tracker.Tracked{Float64}, ::Int64) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:118
[10] #4 at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:106 [inlined]
[11] foreach at ./abstractarray.jl:1836 [inlined]
[12] back_(::Flux.Tracker.Grads, ::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##202#203")),Tuple{Flux.Tracker.Tracked{Float64},Flux.Tracker.Tracked{Float64}}}, ::Int64) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:106
[13] back(::Flux.Tracker.Grads, ::Flux.Tracker.Tracked{Float64}, ::Int64) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:118
[14] #6 at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:131 [inlined]
[15] (::getfield(Flux.Tracker, Symbol("##9#11")){getfield(Flux.Tracker, Symbol("##6#7")){Params,Flux.Tracker.TrackedReal{Float64}}})(::Int64) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:140
[16] gradient(::Function, ::Flux.Tracker.TrackedReal{Float64}) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:152
[17] derivative(::Function, ::Flux.Tracker.TrackedReal{Float64}) at /Users/nurban/.julia/packages/Flux/UHjNa/src/tracker/back.jl:155
[18] top-level scope at In[448]:12
The failure depends on the value of the forcing coefficient β; changing it to 0.7 works. After some diagnosis, it appears that the problem occurs for β > 0.75, which is where the gradient evaluated at c = 0.25 changes sign from negative to positive.
This is a case where the gradient exists and a finite difference approximation does fine as far as I can tell, but AD fails, apparently due to a logarithm lurking somewhere inside the chain rule. Any suggestions? Maybe this can be worked around by hand-coding part of the gradient, but I’d rather not do that in my more complicated real problem.
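A finite difference check of this kind can be sketched in plain Julia, without any AD package. This is only an illustration under stated assumptions: the update rule is taken to be `x[i+1] = F[i] - c*x[i]` (which is what the `-0.05` in the error message suggests), and the names `f_plain` and `central_diff` and the step size `h` are made up for the example.

```julia
# Plain Float64 version of the recurrence (assumed update: x[i+1] = F[i] - c*x[i]).
function f_plain(c; β = 0.8)
    F = 1.0 .- β .* [0, 1, 2]
    x = zeros(3)
    for i in 1:2
        x[i+1] = F[i] - c * x[i]
    end
    return sum(abs2, x)
end

# Central finite difference; the step h is an arbitrary illustrative choice.
central_diff(g, c; h = 1e-6) = (g(c + h) - g(c - h)) / (2h)

fd = central_diff(f_plain, 0.25)

# By hand: x = [0, 1, F[2] - c], so f = 1 + (F[2] - c)^2 and df/dc = -2 * (F[2] - c).
analytic = -2 * ((1.0 - 0.8) - 0.25)

println("finite difference: ", fd)
println("hand-derived:      ", analytic)
```

Under these assumptions the gradient at c = 0.25 is positive for β = 0.8 and negative for β = 0.7, and it is perfectly finite in both cases, so the `DomainError` comes from how the derivative is computed rather than from the function itself.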