As DifferentiationInterface.jl gains more traction, I find myself forced to confront the issue of AD with complex numbers. Right now they are not officially supported, but they should be, and I wonder what the community needs.
For instance, the derivative of a function \mathbb{C} \to \mathbb{C} can be represented as a matrix in \mathbb{R}^{2 \times 2}, or by a pair in \mathbb{C}^2, but not by a single complex number (except in the holomorphic case); a concrete sketch follows the table below. I need to make such a decision for each operator, i.e. fill in the following table and figure out, for each cell:
- Should I error or return something?
- What is the correct convention for the result?
| operator | \mathbb{R} \to \mathbb{C} | \mathbb{C} \to \mathbb{R} | \mathbb{C} \to \mathbb{C} |
|---|---|---|---|
| DI.derivative (scalar to scalar) | ? | ? | ? |
| DI.derivative (scalar to array) | ? | ? | ? |
| DI.gradient (array to scalar) | ? | ? | ? |
| DI.jacobian (array to array) | ? | ? | ? |
| DI.hessian (array to scalar) | ? | ? | ? |
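To make the representation question concrete, here is a minimal sketch (plain Julia with ForwardDiff.jl, not DI's API; real_jacobian is a name I made up) that views a \mathbb{C} \to \mathbb{C} function as a map \mathbb{R}^2 \to \mathbb{R}^2 and computes the full 2 \times 2 real Jacobian, which is the most general representation of the derivative:

```julia
using ForwardDiff

# View f : C -> C as a map R^2 -> R^2 acting on (x, y) = (real(z), imag(z)).
function real_jacobian(f, z::Complex)
    g(v) = let w = f(complex(v[1], v[2]))
        [real(w), imag(w)]
    end
    return ForwardDiff.jacobian(g, [real(z), imag(z)])  # 2x2 real matrix
end

f(z) = z^2 + conj(z)               # not holomorphic because of the conj
J = real_jacobian(f, 1.0 + 2.0im)  # full derivative information
```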
I also want to figure out how holomorphic behavior should be signaled to DI, if it can be exploited. Perhaps a function wrapper, or a keyword argument à la JAX?
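As a strawman for the wrapper option (purely hypothetical, nothing like this exists in DI today), it could be as simple as:

```julia
# Hypothetical wrapper: a user promise that f is holomorphic.
struct Holomorphic{F}
    f::F
end

(h::Holomorphic)(x...) = h.f(x...)
```

A call like derivative(Holomorphic(f), backend, z) could then be allowed to return a single complex number for a \mathbb{C} \to \mathbb{C} function, while a plain f keeps the more general convention.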
I’m partial to the CR calculus approach where you use a pair of complex numbers \partial f / \partial z and \partial f / \partial \bar{z}.
I find this easiest to use in the optimization context, where the functions ultimately map to \mathbb{R} (in which case the two derivatives are conjugates of each other), because it maps very naturally onto a gradient that can be passed to optimization software. It also simplifies nicely in the holomorphic case (where the antilinear term \partial f / \partial \bar{z} = 0), and it lets you keep thinking in terms of complex numbers (rather than pairs of real and imaginary parts, which is often conceptually awkward).
But I’m not sure how this meshes with the internals of AD systems.
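For reference, the CR (Wirtinger) pair can be recovered from the real partials, so any backend that can produce the 2 \times 2 real Jacobian from the earlier sketch can also expose this representation. A sketch reusing that hypothetical real_jacobian helper:

```julia
# Wirtinger / CR derivatives from the 2x2 real Jacobian of (x, y) -> (u, v):
#   df/dz    = (df/dx - i*df/dy) / 2
#   df/dzbar = (df/dx + i*df/dy) / 2
function wirtinger(f, z::Complex)
    J = real_jacobian(f, z)             # helper from the previous sketch
    df_dx = complex(J[1, 1], J[2, 1])   # du/dx + i*dv/dx
    df_dy = complex(J[1, 2], J[2, 2])   # du/dy + i*dv/dy
    return (df_dx - im * df_dy) / 2, (df_dx + im * df_dy) / 2
end

# Holomorphic example: the second component (df/dzbar) comes out as zero.
wirtinger(z -> z^2, 1.0 + 2.0im)  # (2.0 + 4.0im, 0.0 + 0.0im)
```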
I am also not sure whether your table should include the JVP of a real function applied to complex vectors. I need this in BifurcationKit.jl and handle it with dispatch for now.
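If I understand the use case correctly (applying the Jacobian of a real function at a real point to a complex tangent, e.g. a complex eigenvector), dispatch can exploit the fact that the Jacobian is a real matrix acting linearly on the tangent. A sketch, where jvp is a stand-in name rather than any existing API:

```julia
using ForwardDiff

# Basic real JVP via a directional derivative: J_f(x) * v.
jvp(f, x::AbstractVector{<:Real}, v::AbstractVector{<:Real}) =
    ForwardDiff.derivative(t -> f(x .+ t .* v), 0.0)

# Complex tangent: split it, since J(x) * (vr + im*vi) = J(x)*vr + im*(J(x)*vi).
jvp(f, x::AbstractVector{<:Real}, v::AbstractVector{<:Complex}) =
    jvp(f, x, real.(v)) .+ im .* jvp(f, x, imag.(v))

fvec(x) = [x[1]^2 + x[2], x[1] * x[2]]
jvp(fvec, [1.0, 2.0], [1.0 + 1.0im, 0.0 + 2.0im])
```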
Actually using this would require a special Newton method and compiler analysis that do not currently exist. What we would need is:
- Something in FunctionProperties.jl/DI for isholomorphic. In theory this can be done by constant propagation of zeros through the chain rules: if the rules were written in the CR calculus form, you could maybe analyze whether the derivative with respect to \bar{z} is a structural zero.
- Functionality in NonlinearSolve.jl, so that if isholomorphic holds, it creates and uses a Jacobian of a different size and runs the alternative Newton method based on that property (roughly as illustrated below).
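A very rough illustration of the trait idea and the resulting choice of Jacobian shape (isholomorphic and jacobian_size are made-up names, not existing FunctionProperties.jl or NonlinearSolve.jl API):

```julia
# Hypothetical trait, conservative default: assume non-holomorphic.
isholomorphic(f) = false

# A user (or a future compiler analysis) opts in for a specific residual:
myf(z) = z .^ 3 .- 1
isholomorphic(::typeof(myf)) = true

# A Newton implementation could then pick the linearization it builds:
# complex n-by-n Jacobian if holomorphic, split real 2n-by-2n otherwise.
jacobian_size(f, n) = isholomorphic(f) ? (n, n) : (2n, 2n)
```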
I’ve wanted such a thing for almost a decade now, but making sure we can properly declare something as holomorphic has been the hard part. The other option is to make it a user-set trait that defaults to not holomorphic, and thus have Newton default to the 2x2 (real-split) form. The downside is that a lot of “standard” ODE cases would then take a pretty big performance hit, and pretty much no one would know to actually set the trait.
So for now, we do the holomorphic Newton method simply because it requires zero code changes: if you slap complex numbers into a standard Newton code, that’s what you get. If the user wants the other case, they should write their nonlinear solve using a real-valued u0 with the real and imaginary parts, and construct the complex numbers inside their f (sketched below). This path of “try the simple thing, error often, but let the user solve it on their own with a simple workaround” has been the okay equilibrium we’ve sat in.
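For concreteness, here is a minimal sketch of that workaround using NonlinearSolve.jl's out-of-place f(u, p) interface (the toy residual fc and the wrapper freal are just an example; the splitting and reassembly are entirely the user's responsibility):

```julia
using NonlinearSolve

# Complex residual we actually care about; the conj makes it non-holomorphic.
fc(z, p) = z .^ 2 .+ conj.(z) .- p

# Real-valued wrapper: u stacks the real and imaginary parts of z.
function freal(u, p)
    n = length(u) ÷ 2
    z = complex.(u[1:n], u[n+1:end])   # rebuild the complex unknowns
    w = fc(z, p)
    return vcat(real.(w), imag.(w))    # split the complex residual back out
end

z0 = [1.2 + 0.8im]
prob = NonlinearProblem(freal, vcat(real.(z0), imag.(z0)), 1.0 + 1.0im)
sol = solve(prob, NewtonRaphson())
z = complex.(sol.u[1:1], sol.u[2:2])   # should be close to 1.0 + 1.0im
```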