Help with recursive type for automatic differentiation (Taylor Numbers) as in a Haskell example

Hello, I am about to try out to implement Taylor numbers for automatic differentiation (just as a simple example for myself). Y have found a very neat version written in Haskell here: http://sriku.org/blog/2019/03/12/automatic-differentiation-dual-numbers--taylor-numbers/
however, I cannot get the pint of how to translate it to julia code in a julian way.

My main questions are:

  1. How to implement a recursive type as is done in the example.
  2. How to define the recursive function dfn.

Would be glad to have a hint from someone who can read Haskell (I can’t).

2 Likes

I wrote up an example implementation of a very simple automatic differentiation tool in Julia for a conference talk: https://github.com/rdeits/DynamicWalking2018.jl/blob/master/notebooks/3.%20The%20Power%20of%20Multiple%20Dispatch.ipynb . Perhaps that will be helpful :slightly_smiling_face:

Oh, I see, you’re looking for a recursive type so that you can take multiple derivatives at once. My example won’t be very helpful, alas.

Yeah, thanks, I already have a working example with dual numbers :wink: I want it all :smiley:

I’m no Haskell user, but here’s an example implementing forward mode autodiff using a recursive type (as far as I understand the term)

struct Dual{T <: Number} <: Number # Notice Dual can hold a Dual!
    value::T
    partial::T
end

derivative(f, x) = f(Dual(x, one(x))).partial
Base.adjoint(f::Function) = x -> derivative(f, x)

const ϵ = Dual(false, true)

import Base: convert, promote_rule

convert(::Type{Dual{T}}, x::Dual{T}) where {T <: Number} = x
convert(::Type{Dual{T}}, x::Dual)    where {T <: Number} = Dual{T}(convert(T, x.value), convert(T, x.partial))
convert(::Type{Dual{T}}, x::Number)  where {T <: Number} = Dual{T}(convert(T, x), zero(T))

promote_rule(::Type{Dual{T}}, ::Type{Dual{U}}) where {T<:Number, U<:Number} = Dual{promote_type(T, U)}
promote_rule(::Type{Dual{T}}, ::Type{U})  where {T<:Number, U<:Number}      = Dual{promote_type(T, U)}
promote_rule(::Type{U}, ::Type{Dual{T}})  where {T<:Number, U<:Number}      = Dual{promote_type(U, T)}

Dual(x::T, y::V) where {T, V} = Dual(promote(x, y)...)

Now we can teach various functions how to deal with Duals

import Base: +, -
(+)(a::Dual, b::Dual) = Dual(a.value + b.value, a.partial + b.partial)
(+)(a::Dual, b::Number) = Dual(a.value + b, a.partial)
(+)(a::Number, b::Dual) = b + a 

(-)(a::Dual)           = Dual(-a.value, -a.partial)
(-)(a::Dual, b::Dual)  = a + -b
(-)(a::Dual, b::Number) = Dual(a.value - b, a.partial)
(-)(a::Number, b::Dual) = Dual(a - b.value, -b.partial)

import Base: *, inv, /
(*)(a::Dual, b::Dual) = Dual(a.value * b.value, (a.partial * b.value) + (a.value * b.partial))
(*)(a::Dual, b::Number) = Dual(a.value * b, a.partial * b)
(*)(a::Number, b::Dual) = Dual(a * b.value, a * b.partial)
(/)(a::Dual, b::Dual) = a * inv(b)
inv(a::Dual)          = Dual(1/(a.value), -a.partial/(a.value)^2)

import Base: ^, exp, log, sqrt
exp(a::Dual)  = Dual(exp(a.value), a.partial * exp(a.value))
log(a::Dual)  = Dual(log(a.value), a.partial/a.value)
sqrt(x::Dual) = Dual(√(x.value), x.partial/(2*√(x.value)))
(^)(a::Dual, b::Dual) = exp(b * log(a))
(^)(a::Dual, b::Integer)      = Dual((a.value)^b, a.partial * b * (a.value)^(b-1)) 

Now we can test it out at the REPL:

julia> f(x) = 2x - 3^x
f (generic function with 1 method)

julia> f'(2.0)
-7.887510598012989

julia> 2 - 3^2 * log(3)
-7.887510598012987

julia> f''(2.0)
-10.86254064731324

julia> - 3^2.0 * log(3)^2
-10.862540647313239

julia> f'''(2.0)
-11.933720641295169

julia> - 3^2.0 * log(3)^3
-11.933720641295167

As you can see, this works for higher order derivatives. The reason is that we wrote the Dual type as allowed to store any number:

struct Dual{T <: Number} <: Number
    value::T
    partial::T
end

and the derivative function

derivative(f, x) = f(Dual(x, one(x))).partial

as creating a dual where the differential part is one(x). If x is already a Dual{T}, this creates a Dual{Dual{T}}, and extracting out the partial part will yield a Dual{T}. This allows higher order derivatives without perturbation confusion.

6 Likes

Hello and thanks. This was exactly the thing I searched for. Funny enough, I already have the dual number defined in the very same way but have overseen the possibility that the value itself can be a Dual as well.

Why do you need Dual + Number and Number + Dual when convert and promote are defined?

1 Like

No problem, I struggled with this for a long time!

Whoops, I don’t actually need that. I copy pasted this out of one of my notebooks I had lying around and tried to take out all the fancy stuff I was doing to improve performance, but must have missed those methods.

Because of the possibility of NaN in floating point numbers, doing ::Dual + ::Number as ::Dual + promote(Dual, ::Number) is less efficient because the compiler isn’t able to guarantee that x::Float + zero(Number) == x:

julia> NaN + zero(Float64) == NaN
false

so there ends up being a bunch of essentially redundant code that won’t get automatically elided by the compiler. Hence, it’s more efficient to manually create methods like ::Dual + ::Number and so on, but it’s not necessary.

Ah, ok, thanks. I was thinking about the possibility that this is done for performance reasons but have not seen the exact case where it could matter. Well, I will probably leave it out in the simple example anyway :wink:

Was able to do fairly direct transcription ( with very loose typing ). For performance, you’d likely want to use generated functions or macros.

struct Taylor
      x
      dx
end

Taylor(x) = Taylor(x,1)
f(t::Taylor) = t.x
df(t::Taylor) = t.dx
f(x::Real) = x
df(x::Real) = zero(x)

dfn(::Val{0}) = f
dfn(n::Integer) = dfn(Val(n))
dfn(::Val{N}) where N = dfn(Val(N-1))∘df


        Base.:+(a::Taylor, b::Taylor) = Taylor(f(a)+f(b), df(a)+df(b))
        Base.:*(a::Taylor, b::Taylor) = Taylor(f(a)*f(b), df(a)*b+a*df(b))
        Base.:+(s::T, b::Taylor) where T<:Real = Taylor(s+f(b),df(b))
        Base.:+(b::Taylor, s::T) where T<:Real = Taylor(s+f(b),df(b))
        Base.:*(s::T, b::Taylor) where T<:Real = Taylor(s*f(b),s*df(b))
        Base.:*(b::Taylor, s::T) where T<:Real = Taylor(s*f(b),s*df(b))

julia> s3 = Taylor(3)*Taylor(3)
Taylor(9, Taylor(6, 2))

julia> f(df(s3))
6

julia> f(df(df(s3)))
2

julia> f(df(df(df(s3))))
0

julia> dfn(2)(s3)
2


2 Likes

I will mark this one as the solution because it is the direct transcription of the Haskell code. However, probably the upper answer

is the better way o go in julia.

1 Like

As described in some of my other posts, I have also implemented higher order recursive dual numbers:

It’s not fully optimized for automatic differentiation yet, but has the general functionality for doing dual quaternions and other geometries.

1 Like

There’s also https://github.com/JuliaDiff/TaylorSeries.jl

1 Like