How to derivative using ForwardDiff

I want any help. I want derivative a quadratic form function

using ForwardDiff

X,T = 80,52

param = [σ,μ]
function q(param::AbstractVector{T}) where T
    σ = param[1]
    V = zeros(T,T)
    for i in 1:T
        for j in 1:T
            V[i,j] = minimum([i,j]) * σ^2
        end
    end


    #buf_μ = Zygote.Buffer(μ)
    μ = param[2]
    μ_vec = zeros(T)
    for t in 1:T
        μ_vec[t] = t * μ
    end

    return μ_vec' * (V \ μ_vec)
end

q([1.0,1.0])


ForwardDiff.gradient(q,[1.0,1.0])

#isa(q_μ(μ), Union{Real,AbstractArray})

but it does not work. here is error message

MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2})
Closest candidates are:
(::Type{T})(::Real, ::RoundingMode) where T<:AbstractFloat at rounding.jl:200
(::Type{T})(::T) where T<:Number at boot.jl:772
(::Type{T})(::AbstractChar) where T<:Union{AbstractChar, Number} at char.jl:50

Stacktrace:
[1] convert(#unused#::Type{Float64}, x::ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2})
@ Base ./number.jl:7
[2] setindex!(::Matrix{Float64}, ::ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2}, ::Int64, ::Int64)
@ Base ./array.jl:968
[3] q(param::Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2}})
@ Main ./In[3]:12
[4] vector_mode_dual_eval!
@ ~/.julia/packages/ForwardDiff/vXysl/src/apiutils.jl:24 [inlined]
[5] vector_mode_gradient(f::typeof(q), x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2}}})
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/gradient.jl:89
[6] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2}}}, ::Val{true})
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/gradient.jl:0
[7] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{typeof(q), Float64}, Float64, 2}}}) (repeats 2 times)
@ ForwardDiff ~/.julia/packages/ForwardDiff/vXysl/src/gradient.jl:17
[8] top-level scope
@ In[3]:28

Could anyone tell me how to solve this problem?

You have a confusion between the type parameter and variable name T.
I think you want T to be the length of the vector. But T in AbstractVector{T} is the element type.

Quite a few things here…

You define an array to hold σ and μ, but they have not been defined at this point, so this line errors. I’m going to guess you meant that line to be a comment, which should then start with a #.

As mentioned, you are confusing the type parameter, T, with a size stored in a variable T. You don’t really need parameterized types and using an AbstractVector is probably not a great idea either.

With those fixed, your function at least executes:

using ForwardDiff

X,T = 80,52

# param = [σ,μ]
function q(param)
    σ = param[1]
    V = zeros(T,T)
    for i in 1:T
        for j in 1:T
            V[i,j] = minimum([i,j]) * σ^2
        end
    end


    #buf_μ = Zygote.Buffer(μ)
    μ = param[2]
    μ_vec = zeros(T)
    for t in 1:T
        μ_vec[t] = t * μ
    end

    return μ_vec' * (V \ μ_vec)
end

q([1.0,1.0])

I’ll leave comments on ForwardDiff’s ability to handle your function to more knowledgeable people.

And here is a version that actually works:

using ForwardDiff

function q(param; T)
    σ, μ = param
    R = eltype(param)
    V = zeros(R, T, T)
    μ_vec = zeros(R, T)
    for i in 1:T, j in 1:T
        V[i,j] = min(i, j) * σ^2
    end
    for i in 1:T
        μ_vec[i] = i * μ
    end
    return μ_vec' * (V \ μ_vec)
end

q([1.0, 1.0]; T=52)
ForwardDiff.gradient(param -> q(param; T=52), [1.0, 1.0])

The reason why ForwardDiff.jl errored is because it needs your function to work with arbitrary subtypes of Real, which includes so-called “dual numbers”. But when you create V and μ_vec, you implicitly declare their elements to be of type Float64 (because zeros(k) is equivalent to zeros(Float64, k)). As a result, when param has elements of type Dual, Julia doesn’t know how to convert them back to Float64 to store things like min(i, j) * σ^2 in your array.
How did I fix this? By specifying that the elements of V and μ_vec should have the same type as the elements of param, which I denoted by R. This is an instance of generic programming, and you’ll run into it quite often in Julia.

Other minor remarks:

  • You don’t need to allocate an array [i, j] when computing the minimum of i and j: the function min(i, j) works for two arguments and will be much faster
  • You don’t seem to use X so I removed it
  • I gave T as a keyword argument to the function, whereas in your code it was a global variable (which can negatively impact performance)
7 Likes

Thanks a lot for your help.
It worked!!

1 Like

Great! Could you mark my answer as solution, so that others know they don’t have to weigh in?

1 Like