Is there a differentiable implementation of matrix square root?

marcsgil · March 30, 2024, 6:10pm

Hello!

I’m trying to make the following code work:

using ForwardDiff, LinearAlgebra

# Fidelity between two density matrices
function fidelity(ρ::AbstractMatrix, σ::AbstractMatrix)
    sqrt_ρ = sqrt(ρ)
    abs2(tr(sqrt(sqrt_ρ * σ * sqrt_ρ)))
end

# Matrix representation of a Bloch vector
function matrix_representation(r)
    [(1+r[1]) (r[2]-r[3]im); (r[2]+r[3]im) (1-r[1])] ./ 2
end

# Fidelity between a density matrix and a Bloch vector
function fidelity(ρ::AbstractMatrix, r::AbstractVector)
    fidelity(ρ, matrix_representation(r))
end

# Gradient of the fidelity
function ∇fidelity(ρ::AbstractMatrix, r::AbstractVector)
    ForwardDiff.gradient(r -> fidelity(ρ, r), r)
end

r = [0, 0, 0]
ρ = matrix_representation(r)
∇fidelity(ρ, r)

In it, I attempt to calculate the gradient of the fidelity between two quantum states, differentiating with respect to the Bloch vector of one of them. Unfortunately, I get the error

ERROR: MethodError: no method matching eigen!(::Hermitian{Complex{ForwardDiff.Dual{…}}, Matrix{Complex{…}}}; sortby::Nothing)

which I believe means that the sqrt(::AbstractMatrix) method is not differentiable by ForwardDiff.

Could someone point me to a differentiable implementation of such a method, or propose a workaround?

Note: This 2×2 case is only an example; I actually need a method that works in arbitrary dimension.

Thanks!

gdalle · March 31, 2024, 8:24am

You can take a look at Enzyme.jl or DifferentiableFactorizations.jl, although I’m unsure how either behaves with complex input.

stevengj · March 31, 2024, 2:30pm

Since you only need a 2x2 matrix square root, you can use the analytical formula, which should be differentiable by ForwardDiff etc. This will be much more efficient anyway than forming a generic (albeit Hermitian) matrix and taking the square root (via eigenvalues).
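For illustration, here is a minimal sketch of one common closed form for the 2×2 square root. It assumes the input is Hermitian positive definite, so that its determinant and trace are real and positive. The name sqrt2x2 is hypothetical, and this is not necessarily the code any package uses:

```julia
using LinearAlgebra

# Closed-form principal square root of a 2×2 positive definite matrix:
#   sqrt(A) = (A + s*I) / t,  s = sqrt(det(A)),  t = sqrt(tr(A) + 2s)
# This follows from the Cayley-Hamilton identity A^2 = tr(A)*A - det(A)*I.
# Assumption: A is Hermitian positive definite, so det(A) and tr(A) are real.
function sqrt2x2(A::AbstractMatrix)
    size(A) == (2, 2) || throw(ArgumentError("expected a 2×2 matrix"))
    d = real(A[1, 1] * A[2, 2] - A[1, 2] * A[2, 1])  # determinant
    s = sqrt(d)
    t = sqrt(real(A[1, 1] + A[2, 2]) + 2s)            # sqrt(trace + 2s)
    return (A + s * I) / t
end

A = [2.0 1.0; 1.0 2.0]
B = sqrt2x2(A)
B * B ≈ A   # squaring the result should recover A
```

Because the formula uses only arithmetic and scalar sqrt, it avoids the eigen! call from the error message. Note that the derivative of sqrt blows up when the determinant is zero (for example, for pure states), so gradients there are not well defined.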

Actually, this 2x2 formula is already included in StaticArrays.jl, so you can just use an SMatrix — which you should probably be using anyway for such small fixed-size matrices — and it should work (and be much faster).

Changing your code to:

using StaticArrays
function matrix_representation(r)
    @SMatrix[(1+r[1]) (r[2]-r[3]im); (r[2]+r[3]im) (1-r[1])] ./ 2
end

gives

julia> ∇fidelity(ρ, [0,0,0])
3-element Vector{Float64}:
 0.0
 0.0
 0.0

julia> ∇fidelity(ρ, [0.1,0.2,0.3])
3-element Vector{Float64}:
 -0.053916386601719206
 -0.10783277320343838
 -0.1617491598051576

which matches a finite-difference check:

julia> r = [0.1,0.2,0.3]; dr = randn(3) * 1e-8;

julia> isapprox(fidelity(ρ,r+dr) - fidelity(ρ,r), ∇fidelity(ρ,r) ⋅ dr, rtol=1e-5)
true

marcsgil · March 31, 2024, 2:58pm

When I wrote the post, I realized that I hadn’t made it explicit that the 2×2 case was only an example, and that I actually need it to work in arbitrary dimensions. I thought I had edited it to include a note stating that, but I must have forgotten to save it. Anyway, thank you for the response, I didn’t know that StaticArrays had those optimized methods! Happy Easter!!

stevengj · March 31, 2024, 4:48pm

If you are in high dimensions (with correspondingly lots of parameters), then you probably don’t want forward-mode AD (à la ForwardDiff.jl), as in that case the cost of the gradient scales with the function cost times the number of parameters. Instead, you want reverse-mode AD (à la Zygote.jl, ReverseDiff.jl, or Enzyme.jl), as in that case the cost of the gradient scales with the function cost, independent of the number of parameters.

ChainRules.jl (used by Zygote.jl) already has a rule for differentiating the sqrt of a Hermitian matrix. (Don’t forget to wrap your matrix in Hermitian.)
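As a rough, untested sketch of what that could look like, assuming Zygote.jl is installed and that the ChainRules rule applies to both sqrt calls once their arguments are wrapped in Hermitian (the name fidelity_herm is hypothetical; the 2×2 parameterization is reused from the original post only to keep the example small):

```julia
using LinearAlgebra, Zygote

# Same Bloch-vector parameterization as in the original post.
function matrix_representation(r)
    [(1+r[1]) (r[2]-r[3]im); (r[2]+r[3]im) (1-r[1])] ./ 2
end

# Fidelity with both square roots taken on Hermitian-wrapped matrices,
# so that the Hermitian-specific sqrt method (and its AD rule) is used.
function fidelity_herm(ρ::AbstractMatrix, σ::AbstractMatrix)
    sqrt_ρ = sqrt(Hermitian(ρ))
    M = sqrt_ρ * σ * sqrt_ρ          # Hermitian up to rounding error
    abs2(tr(sqrt(Hermitian(M))))
end

ρ = matrix_representation([0.0, 0.0, 0.0])
r = [0.1, 0.2, 0.3]

# Zygote.gradient returns a tuple with one entry per argument.
g = Zygote.gradient(r -> fidelity_herm(ρ, matrix_representation(r)), r)[1]
```

Nothing in fidelity_herm depends on the matrices being 2×2, so the same function should apply in higher dimensions given a differentiable parameterization of σ. Comparing g against a finite-difference check like the one earlier in the thread is a sensible way to confirm the result.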

wsmoses · March 31, 2024, 10:15pm

See the Enzyme.jl FAQ for a discussion of complex numbers in Enzyme.
