Errors with autodiff in a for loop

While implementing an online optimization method (a for loop that processes new data at each step), I am using HesVec() and JacVec() from SparseDiffTools.jl to compute Hessian-vector and Jacobian-vector products, with autodiff=false for each. Inside the same loop I also compute gradients with sparse(ForwardDiff.gradient(objectivefn, somevector)). Each of these autodiff tools presumably re-evaluates the objective every time it is called.

HesVec() and JacVec() run without error, so I assume they are fine. However, after the first step (the first data point), ForwardDiff.gradient() throws a conversion error at the line in the loss function where Flux.Losses.logitcrossentropy(yhat, yi) is called, apparently attempting a conversion from ForwardDiff.Dual to an AbstractFloat. One guess is that NaN values appear somewhere during the gradient computation and Julia marks them as #unused# (I am not sure), as shown in the error text below. The problem is that I find it difficult to trace this error back to its exact cause, or to tell whether ForwardDiff.gradient() simply cannot handle the sparsity or the large values of the input vector after the first update.

MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{Main.ModuleA.var"#4#6"{Matrix{Float32}, Main.ModuleA.var"#3#5"{Flux.var"#64#66"{Vector{AbstractArray{Float32}}}, Int64}, Flux.var"#64#66"{Vector{AbstractArray{Float32}}}, Int64}, Float32}, Float64, 12})
Closest candidates are:
  (::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at C:\Users\user\AppData\Local\Programs\Julia-1.7.1\share\julia\base\rounding.jl:200
  (::Type{T})(::T) where T<:Number at C:\Users\user\AppData\Local\Programs\Julia-1.7.1\share\julia\base\boot.jl:770
  (::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at C:\Users\user\AppData\Local\Programs\Julia-1.7.1\share\julia\base\char.jl:50
  ...

Stacktrace:
  [1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{Main.ModuleA.var"#4#6"{Matrix{Float32}, Main.ModuleA.var"#3#5"{Flux.var"#64#66"{Vector{AbstractArray{Float32}}}, Int64}, Flux.var"#64#66"{Vector{AbstractArray{Float32}}}, Int64}, Float32}, Float64, 12})
    @ Base .\number.jl:7
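
For reference, a stripped-down toy with a pre-allocated buffer (nothing like my real Flux loss, just the simplest pattern I could reduce the problem to) triggers the same kind of MethodError:

# Toy objective that writes intermediate results into a pre-allocated Float64 buffer.
using ForwardDiff

const buffer = zeros(Float64, 3)              # reused across calls

function toyobjective(v)
    buffer .= 2 .* v                          # this is the line where the MethodError is raised during ForwardDiff.gradient
    return sum(abs2, buffer)
end

toyobjective(ones(3))                         # plain evaluation works
ForwardDiff.gradient(toyobjective, ones(3))   # MethodError: no method matching Float64(::ForwardDiff.Dual{...})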

On the other hand, I would like to know whether there is a way to “record operations for autodiff”, as with TensorFlow’s GradientTape, so that I could call my objective function only once and reuse its evaluation to compute the various derivatives with respect to a “watched” variable. I think that would also make it easier to trace this error, since I suspect the Julia autodiff tools are manipulating the element type of my input vector somehow.

Thank you.

You need to make sure your caches accept dual numbers. See SciML/PreallocationTools.jl (https://github.com/SciML/PreallocationTools.jl) as a tool to help with this.
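
For example, a minimal sketch of the same pattern with a dual-compatible cache (DiffCache and get_tmp come from PreallocationTools.jl; adapt it to your real objective and cache sizes):

using ForwardDiff, PreallocationTools

cache = DiffCache(zeros(Float64, 3))          # holds a Float64 buffer and a matching Dual buffer

function objective(v)
    buf = get_tmp(cache, v)                   # returns the buffer matching eltype(v)
    buf .= 2 .* v                             # now fine for both Float64 and Dual inputs
    return sum(abs2, buf)
end

objective(ones(3))                            # ordinary evaluation uses the Float64 buffer
ForwardDiff.gradient(objective, ones(3))      # uses the Dual buffer, no conversion error

If ForwardDiff ends up using a different chunk size than the default, DiffCache also accepts the chunk size as a second argument.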


That solves the problem! Thanks.