Hey,
This is a follow up to this previous post where I was looking for the right derivatives of the gamma_inc
function. I found them. Now i am trying to tell ForwardDiff to use them… with issues.
The function has two arguments a
and x
. The derivtive w.r.t. x is already in ForwardDiff:
ForwardDiff.derivative(x -> gamma_inc(2,x)[2],2.0) # Works
I just added the derivative w.r.t a
:
using SpecialFunctions, HypergeometricFunctions, ForwardDiff
function Γ(a,x)
return gamma_inc(a,x)[2]
end
function ∂Γ_∂a(a,x)
g = gamma(a)
dg = digamma(a)
G = Γ(a,x)
lx = log(x)
r = pFq([a,a],[a+1,a+1],-x)
return a^(-2) * x^a * r/g + (1 - G)*(dg - lx)
end
function SpecialFunctions.gamma_inc(d::ForwardDiff.Dual{T,<:Real}, x::Real, ind::Integer) where {T}
a = ForwardDiff.value(d)
p, q = SpecialFunctions.gamma_inc(a, x, ind)
∂q = ∂Γ_∂a(a,x)
return (ForwardDiff.Dual{T}(p, -∂q), ForwardDiff.Dual{T}(q, ∂q))
end
ForwardDiff.derivative(a -> gamma_inc(a,3.0)[2],2.0) # Works
But somehow, forwardiff does not want to pass dual numbers through both parameters :
ForwardDiff.derivative(a -> gamma_inc(a,3+a)[2],2.0) # Doesnt work.
Fair enough. Remembering my 101 lectures on the chain rule, I craft f(dual,dual)
from f(dual,x)
and f(a,dual)
as follows:
function SpecialFunctions.gamma_inc(da::ForwardDiff.Dual{T,<:Real}, dx::ForwardDiff.Dual{T,<:Real}, ind::Integer) where {T}
a = ForwardDiff.value(da)
x = ForwardDiff.value(dx)
∂1 = SpecialFunctions.gamma_inc(da, x, ind::Integer)
∂2 = SpecialFunctions.gamma_inc(a, dx, ind::Integer)
return ∂1 .* da .+ ∂2 .* dx
end
And now it works :
ForwardDiff.derivative(a -> gamma_inc(a,3+a)[2],2.0) # Works !
Youpi ! So i go back to what i was doing, trying to use it… but no :
ForwardDiff.gradient(x -> gamma_inc(x[1]+x[2],x[1]-x[2])[2],[3.0,1.0]) # Still does not work...
I think that the derivatives i added should be enough to compute this gradient ! I do not understand what I missed. Could someone help?
From the stack trace, I think it may come from the line ∂1 .* da .+ ∂2 .* dx
in my return statement, where, in case we are taking gradients w.r.t. several variables, the multiplications / additions do not work correctly; Maybe i should re-write this somehow differently ?
EDIT :
If I change the function to the following to see things :
function SpecialFunctions.gamma_inc(da::ForwardDiff.Dual{T,<:Real}, dx::ForwardDiff.Dual{T,<:Real}, ind::Integer) where {T}
a = ForwardDiff.value(da)
x = ForwardDiff.value(dx)
∂1 = SpecialFunctions.gamma_inc(da, x, ind::Integer)
∂2 = SpecialFunctions.gamma_inc(a, dx, ind::Integer)
@show da
@show dx
@show ∂1[1]
@show ∂1[1]
@show ∂2[1]
@show ∂2[2]
return (∂1[1] * da + ∂2[1] * dx,∂1[2] * da + ∂2[2] * dx)
end
Then the stacktrace I Have looks like :
julia> ForwardDiff.gradient(x -> gamma_inc(x[1]+x[2],x[1]-x[2])[2],[3.0,1.0]) # Still does not work...
da = Dual{ForwardDiff.Tag{var"#60#61", Float64}}(4.0,1.0,1.0)
dx = Dual{ForwardDiff.Tag{var"#60#61", Float64}}(2.0,1.0,-1.0)
∂1[1] = Dual{ForwardDiff.Tag{var"#60#61", Float64}}(0.1428765395014529,-0.1302623114706677)
∂1[1] = Dual{ForwardDiff.Tag{var"#60#61", Float64}}(0.1428765395014529,-0.1302623114706677)
∂2[1] = Dual{ForwardDiff.Tag{var"#60#61", Float64}}(0.1428765395014529,0.1804470443154836,-0.1804470443154836)
∂2[2] = Dual{ForwardDiff.Tag{var"#60#61", Float64}}(0.8571234604985472,-0.1804470443154836,0.1804470443154836)
ERROR: MethodError: no method matching _mul_partials(::ForwardDiff.Partials{1, Float64}, ::ForwardDiff.Partials{2, Float64}, ::Float64, ::Float64)
Closest candidates are:
_mul_partials(::ForwardDiff.Partials{0, A}, ::ForwardDiff.Partials{N, B}, ::Any, ::Any) where {N, A, B} at C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\partials.jl:141
_mul_partials(::ForwardDiff.Partials{N, A}, ::ForwardDiff.Partials{0, B}, ::Any, ::Any) where {N, A, B} at C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\partials.jl:142
_mul_partials(::ForwardDiff.Partials{N}, ::ForwardDiff.Partials{N}, ::Any, ::Any) where N at C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\partials.jl:118
...
Stacktrace:
[1] dual_definition_retval(#unused#::Val{ForwardDiff.Tag{var"#60#61", Float64}}, val::Float64, deriv1::Float64, partial1::ForwardDiff.Partials{1, Float64}, deriv2::Float64, partial2::ForwardDiff.Partials{2, Float64})
@ ForwardDiff C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\dual.jl:203
[2] *
@ C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\dual.jl:271 [inlined]
[3] gamma_inc(da::ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}, dx::ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}, ind::Int64)
@ Main .\Untitled-1:55
[4] gamma_inc(a::ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}, x::ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2})
@ SpecialFunctions C:\Users\lrnv\.julia\packages\SpecialFunctions\hefUc\src\gamma_inc.jl:858
[5] (::var"#60#61")(x::Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}})
@ Main .\Untitled-1:60
[6] vector_mode_dual_eval!(f::var"#60#61", cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}}}, x::Vector{Float64})
@ ForwardDiff C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\apiutils.jl:37
[7] vector_mode_gradient(f::var"#60#61", x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}}})
@ ForwardDiff C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\gradient.jl:106
[8] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}}}, ::Val{true})
@ ForwardDiff C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\gradient.jl:19
[9] gradient(f::Function, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2, Vector{ForwardDiff.Dual{ForwardDiff.Tag{var"#60#61", Float64}, Float64, 2}}}) (repeats 2 times)
@ ForwardDiff C:\Users\lrnv\.julia\packages\ForwardDiff\QdStj\src\gradient.jl:17
[10] top-level scope
@ Untitled-1:60
julia>
I think the issue is that the four partials only have 2 values : there is only one derivative w.r.t. a
or x
, while da
and dx
have 3 values: there is two derivatives. Which is logic.
How could I modify the ∂1[1]
,∂1[2]
,∂2[1]
and ∂2[2]
to have one more (zero) derivative in them so that the multiplication works ?
EDIT :I tried with ReverseDiff, but no luck yet :
using SpecialFunctions, HypergeometricFunctions, ReverseDiff, ChainRulesCore
function ∂Γ_∂a(a,x)
g = gamma(a)
dg = digamma(a)
G = gamma_inc(a,x)[1]
lx = log(x)
r = pFq([a,a],[a+1,a+1],-x)
return a^(-2) * x^a * r/g + G*(dg - lx)
end
# Copied and modified the chainrule from SpecialFunctions.jl:
ChainRulesCore.@scalar_rule(
SpecialFunctions.gamma_inc(a, x, IND),
@setup(z = exp(-x) * x^(a - 1) / gamma(a)),
(
-∂Γ_∂a(a,x),
z,
ChainRulesCore.NoTangent(),
),
(
∂Γ_∂a(a,x),
-z,
ChainRulesCore.NoTangent(),
),
)
# Shananigans to call reversediff
# I cannot find the right way to call it...
# And when it looks like i manage to clal it correclty, it does not use my chainrule...
gamma_inc_arr(x) = gamma_inc.(x,2.3)
ReverseDiff.gradient(gamma_inc_arr, ([1.0],))