Hi everyone,
I am developping my own flavour of automatic differentiation using Dual numbers. First - why would I ever do that when so much work has been put into ForwardDiff?
ForwardDiff works by transforming functions. You use it with calls like
∇f = ForwardDiff.gradient(f)
slope = ∇f(x)
Totally nifty but this seems to limit the API to “single input single output” functions. One could probably workaround this with closures, but ain’t that awkward. I would have loved to contribute an extension of the API, but I am so out of my depth with the ForwardDiff code that I am afraid it won’t be me…
The underlying dual numbers do not have this limitation though. In my own little implementation a typical usage is (computing the imbalance vector of a finite element)
Δ = δ(y) # perturbation here
R,χ = balance(e,t,y+α*Δ,y∂t+β*Δ,zeros(vsize(y)),χo,dbg)
return value(R),∂(R),χ
Here ∂(R)
is αdR/dy + βdR/dy’ = αK+βC with α=1 and β=1/Δt delivering the goods for an implicit Euler time-stepping. And anyway, there’s always the value of learning something!
I got it all working, including nested derivatives, but precisely because multiple-argument functions may be differentiated, I open the door for perturbation collisions. I know how to handle that from a mathematical point of view, to each point in my code that introduces a perturbation, I must associate a “tag”. When two duals meet they need to compare tags to decide who rides on top.
This leaves me with two questions: how to generate tags and how to compare tags at compile time.
When I try comparing tags like this
combine(a::dR{Ta},b::dR{Tb}) where{Ta<Tb} = println("b rides on top")
Julia gives me a frown.
So I go spying in the ForwardDiff code, but remain largely baffled after multiple parsings. In Dual.jl, in struct Dual{T,V,N} <: Real
clearly T
is the tag. The operator ≺
is used to compare tags, but apart from the method ≺(a,b) = throw(DualMismatchError(a,b))
, where is ≺
defined?
The operator is used in promote rules - not, as I find, in binary operations:
Base.@pure function Base.promote_rule(::Type{Dual{T1,V1,N1}},
::Type{Dual{T2,V2,N2}}) where {T1,V1,N1,T2,V2,N2}
# V1 and V2 might themselves be Dual types
if T2 ≺ T1
Dual{T1,promote_type(V1,Dual{T2,V2,N2}),N1}
else
Dual{T2,promote_type(V2,Dual{T1,V1,N1}),N2}
end
end
Does this resolve at compile time? Thanks to greedy evaluation?
Am I right in assuming that the strategy here is to promote all arguments before binary operations which are defined for arguments with the same stamp? (as opposed to creating binary operations that tackle various types)?
Finally, tag generation in Derivatives.jl
T = typeof(Tag(f, R))
I do not find where Tag
is defined. Am I right to assume that the inputs f
and R
are a way to identify a perturbation?
If you have read that far: thank you!
Philippe