Automatic differentiation of complex valued functions

Does anyone know of a Julia automatic differentiation package that supports complex-valued functions (with real arguments)?

I have been using ForwardDiff.jl but it is missing this functionality: https://github.com/JuliaDiff/ForwardDiff.jl/issues/364

Try Zygote.jl? Or maybe it uses the common DiffRules.jl (or maybe it uses ForwardDiff anyway…)

One trick is to decompose to real and imaginary parts, then reassemble:

import ForwardDiff
function f(x)
    y = complex(x[1], x[2]) * exp(complex(x[3], x[4]))
    vcat(real.(y), imag.(y))
end
ForwardDiff.jacobian(f, ones(4))

Zygote looks promising but errors with:

ERROR: Output is complex, so the gradient is not defined.

I found an interesting discussion https://github.com/FluxML/Zygote.jl/issues/342

Interesting, thanks. I’ll see if that can solve my problem.

The key words you want to look up here are “Wirtinger derivatives”. AFAIK you may have to wait until various packages use ChainRules.jl before this will work really nicely (I did think Zygote already supported complex derivatives to some extent. Maybe I got the wrong impression about that.).
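For reference (standard definitions, not from this thread): for f(z) with z = x + iy, the Wirtinger derivatives are

```latex
\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\,\frac{\partial f}{\partial y}\right),
\qquad
\frac{\partial f}{\partial \bar z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y}\right)
```

and f is holomorphic exactly when ∂f/∂z̄ = 0, which is why AD systems need both derivatives to handle general (non-holomorphic) complex functions.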

For a fair amount of discussion related to this see https://github.com/JuliaDiff/ChainRules.jl/search?q=wirtinger&type=Issues


You don't actually need Wirtinger derivatives for complex functions with real input; this case is pretty straightforward with forward-mode AD, so implementing it in ForwardDiff.jl shouldn't be that difficult. Because Zygote.jl uses reverse-mode AD, it is much better suited for differentiating functions with complex input but real output. First-class complex differentiation support in ChainRules is still very much a work in progress, and probably won't be part of v1.0, since there are many challenges in supporting it for both forward- and reverse-mode AD. There is a PR working on this in the underlying ChainRulesCore.jl, and quite a bit of ChainRules.jl will have to change accordingly as well.
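To make the forward-mode case concrete, here is a sketch of the split-into-parts workaround for a scalar R → C function (`f` here is a hypothetical example, not from the thread): differentiate the real and imaginary parts separately and recombine.

```julia
import ForwardDiff

# Hypothetical R -> C example; its analytic derivative is (1 + im)*exp((1 + im)*x)
f(x) = exp(x + x*im)

# Differentiate real and imaginary parts separately, then recombine:
df(x) = ForwardDiff.derivative(t -> real(f(t)), x) +
        im * ForwardDiff.derivative(t -> imag(f(t)), x)

df(1.3) ≈ (1 + im) * exp(1.3 + 1.3im)  # true
```

This is exactly the trick from the Jacobian example earlier in the thread, just written for a scalar argument.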


@jtravs Maybe you could explain a bit more about your usecase, to figure out whether forward- or reverse-mode AD would make more sense.

My usecase is rather simple. I have a real function of real argument which I need to calculate the derivative of. Several function layers deep inside that function I have a complex valued function of real argument. Further up the chain I take the real part. Despite this, the fact that the nested function is complex valued prevents ForwardDiff and Zygote from working. Until recently that deeply nested function was real valued and everything worked fine. I was hoping to find the minimum change to my code/packages to get this working. In principle I could separate the real and imaginary parts analytically but I would prefer to avoid that, as I would need to do it for every case.

If it helps, the “deeply nested” complex valued function of real argument I described above knows how to calculate its own derivative, so this can be provided. But I do not understand how to pass that information to e.g. ForwardDiff (so that it doesn’t need to look inside, but can just take it as an opaque function with derivative).
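One common pattern for this (a hedged sketch, not an official ForwardDiff API) is to add a method for `ForwardDiff.Dual` arguments, so ForwardDiff treats the function as a black box and uses the supplied derivative. Here `g` and `dg` are hypothetical stand-ins for your nested function and its known derivative:

```julia
import ForwardDiff
using ForwardDiff: Dual, value, partials

g(x::Real) = exp(x + x*im)               # stand-in for the opaque R -> C function
dg(x::Real) = (1 + im) * exp(x + x*im)   # its known derivative

# Intercept Dual numbers so ForwardDiff never looks inside g:
function g(d::Dual{T}) where {T}
    xv = value(d)
    y, dy = g(xv), dg(xv)
    # propagate the seed through the real and imaginary parts separately
    complex(Dual{T}(real(y), real(dy) * partials(d)),
            Dual{T}(imag(y), imag(dy) * partials(d)))
end

ForwardDiff.derivative(x -> real(g(x)), 1.3)  # now uses dg, no tracing inside g
```

The Dual-dispatch trick works because `Dual <: Real`, so the new method takes over whenever ForwardDiff pushes its numbers through your code.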

Zygote.jl should definitely be able to handle R -> R functions with intermediary complex functions. Have you actually tried it on your whole function, not just the part that is R -> C?

I hadn’t, but I just tried something similar. If my Real->Complex function is g(x), then running

f(x) = real(g(x))
gradient(f, 1.3)

results in

ERROR: InexactError: Float64(-0.7504565405440041 + 0.2501521801813347im)

You should note that g is a callable struct.

I just tried a much simpler case

p(x) = exp(x + x*im)
q(x) = real(p(x))
gradient(q, 1.3)

and that gives:
(-2.5540482783443026 - 4.517113399272802im,)

Then your g(x) probably contains a step that Zygote can’t differentiate through. Could you post what g looks like?

It’s a trivial cubic spline:

import LinearAlgebra

struct CSpline{Tx,Ty}
    x::AbstractArray{Tx,1}
    y::AbstractArray{Ty,1}
    D::AbstractArray{Ty,1}
end

# make CSpline broadcast like a scalar
Broadcast.broadcastable(c::CSpline) = Ref(c)

function CSpline(x, y)
    R = similar(y)
    R[1] = y[2] - y[1]
    for i in 2:(length(y)-1)
        R[i] = y[i+1] - y[i-1]
    end
    R[end] = y[end] - y[end - 1]
    @. R *= 3
    d = fill(4.0, size(y))
    d[1] = 2.0
    d[end] = 2.0
    dl = fill(1.0, length(y) - 1)
    M = LinearAlgebra.Tridiagonal(dl, d, dl)
    D = M \ R
    CSpline(x, y, D)
end

function (c::CSpline)(x0)
    if x0 <= c.x[1]
        i = 2
    elseif x0 >= c.x[end]
        i = length(c.x)
    else
        i = findfirst(x0 .< c.x)
    end
    t = (x0 - c.x[i - 1])/(c.x[i] - c.x[i - 1])
    c.y[i - 1] + c.D[i - 1]*t + (3*(c.y[i] - c.y[i - 1]) - 2*c.D[i - 1] - c.D[i])*t^2 + (2*(c.y[i - 1] - c.y[i]) + c.D[i - 1] + c.D[i])*t^3
end

and then I create g with (note this is an artificial test)

A_x = [1.0, 1.7, 8.0, 9.7, 10.3, 12.5, 32]
A = [4.0, 2.7, 1.0, 0.7, 4.3, 17.4, 43]   
g = CSpline(A_x, A .+ im.*A./3.0)

But I believe the answer in my previous post is also incorrect (the derivative of the real function should be real, but it provided the full complex derivative of the nested function). Or am I missing some basic mathematics here?

Your function contains mutation of arrays, which Zygote can’t just differentiate through. You could take a look at Zygote.Buffer and see if that works for your case. The best solution might also be to just implement your own custom adjoint. BTW, your struct CSpline still contains abstract types, so it can’t be stack-allocated. Try

struct CSpline{Tx,Ty,Vx<:AbstractVector{Tx},Vy<:AbstractVector{Ty}}
    x::Vx
    y::Vy
    D::Vy
end

instead.
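For the mutation issue, here is a hedged sketch of what the `Zygote.Buffer` route could look like for the `R` vector in the constructor (`build_R` is a hypothetical helper, not part of the original code):

```julia
using Zygote

# Build the spline right-hand-side vector without mutating a plain Array:
function build_R(y)
    R = Zygote.Buffer(y)          # Buffer allows setindex! under Zygote
    R[1] = y[2] - y[1]
    for i in 2:(length(y)-1)
        R[i] = y[i+1] - y[i-1]
    end
    R[length(y)] = y[end] - y[end-1]
    3 .* copy(R)                  # copy converts the Buffer back into an array
end

build_R([1.0, 2.0, 4.0])  # [3.0, 9.0, 6.0]
Zygote.gradient(y -> sum(abs2, build_R(y)), [1.0, 2.0, 4.0])
```

Whether this suffices depends on whether Zygote actually needs to differentiate through the constructor, as opposed to just the spline evaluation.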


But the actual called function I want the derivative of doesn’t mutate the arrays. Is it still a problem?

OK, I switched to StaticArrays and it now works!
Thank you!


Yes, Zygote doesn’t handle arrays all that well right now, even if they’re just used as constants. Glad to hear that you got it to work though!
