Laplacian for a minimized function

I’m trying to evaluate the Laplacian of a minimized function, laplacian(b) in the code below.

using Optim
using ForwardDiff

function f(a,b)
    sin(a[1])*(cos(a[2]) + cos(b[1]+b[2]))
end

function g(b)
    h(a) = f(a,b)
    res = optimize(h,zeros(2))
    return Optim.minimum(res)
end

function laplacian(b)
    tr(ForwardDiff.hessian(g,[b[1],b[2]]))
end

f(a,b) and g(b) work, but laplacian(b) doesn’t:

MethodError: no method matching Float64(::ForwardDiff.Dual{ForwardDiff.Tag{typeof(g), Int64}, ForwardDiff.Dual{ForwardDiff.Tag{typeof(g), Int64}, Float64, 2}, 2})

How should I change the code?
Any help would be greatly appreciated. Thanks!

The issue is that the optimization starts from zeros(2), which is implicitly zeros(Float64, 2), so Optim ends up trying to store ForwardDiff’s Dual numbers in a Float64 array, which causes the MethodError above. If you want ForwardDiff to be able to differentiate through your optimization algorithm, you have to allow arbitrary number types everywhere, like this:

function g(b::Vector{T}) where {T}
    h(a) = f(a, b)
    res = optimize(h, zeros(T, 2))
    return Optim.minimum(res)
end

I checked and the Laplacian works now.
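The check was just a direct call, e.g.

b = [0.3, 0.4]   # any test point works
laplacian(b)     # now returns a plain Float64 instead of throwing the Dual MethodError
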
However, can I ask if your actual problem really looks like that MWE? Depending on the size of the input space and the autodiff mode (forward or reverse), you may want to take a look at other packages like DiffOpt.jl or ImplicitDifferentiation.jl.

Thank you for your reply. I confirmed that the Laplacian works now.

My actual setup is the following.

function R(μ,E)
    function dR(θ) 
        T =  exp(μ[1]+im*θ[1]) + exp(-μ[1]-im*θ[1]) + exp(μ[2]+im*θ[2]) + exp(-μ[2]-im*θ[2])
        Td = (exp(μ[1]+im*θ[1]) + exp(-μ[1]-im*θ[1]))*(exp(μ[2]+im*θ[2]) + exp(-μ[2]-im*θ[2]))
        Γ = exp(μ[1]+im*θ[1]) - exp(-μ[1]-im*θ[1]) + exp(μ[2]+im*θ[2]) - exp(-μ[2]-im*θ[2])
        return (1/2π)*(1/2π)*log(abs(t*T+td*Td+γ*Γ-E[1]-im*E[2]))
    end
    hcubature(dR,[0,0],[2π,2π],rtol=1e-5)[1]
end

function Φ(E::Vector{T}) where {T}
    Ronkin(μ) = R(μ,E)
    res = optimize(Ronkin,zeros(T,2))
    return Optim.minimum(res)
end

function DOS(E)
    tr(ForwardDiff.hessian(Φ,E))
end

Here t = 1.0, td = 0.5, γ = 0.2.
I need to evaluate DOS(E) over a region -L < E[1], E[2] < L (see the grid sketch at the end of this post).
I’m not familiar with autodiff. What approach is appropriate for this kind of problem?
As far as I have checked, the DOS(E) above does not return the correct value, while a finite-difference approximation of the Laplacian of Φ(E) gives an imprecise but roughly correct value.
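To be concrete, the grid scan I have in mind is roughly the following (L and the grid resolution are placeholders, not my actual values):

L = 4.0                                   # placeholder half-width of the region
Es = range(-L, L; length = 41)            # placeholder grid resolution
dos_grid = [DOS([e1, e2]) for e1 in Es, e2 in Es]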

If I understand correctly you only have 2 parameters? In that case, forward-mode autodiff (with ForwardDiff.jl) should be your best bet.

I’m a bit surprised that it does not return the “right” value. Can you provide a fully runnable example (with imports and numerical values) where the results differ?

Yes, I have 2 parameters, E[1] and E[2], for DOS(E).
Here is my code.

using LinearAlgebra
using HCubature
using Optim
using ForwardDiff

function R(μ,E)
    function dR(θ) 
        T =  exp(μ[1]+im*θ[1]) + exp(-μ[1]-im*θ[1]) + exp(μ[2]+im*θ[2]) + exp(-μ[2]-im*θ[2])
        Td = (exp(μ[1]+im*θ[1]) + exp(-μ[1]-im*θ[1]))*(exp(μ[2]+im*θ[2]) + exp(-μ[2]-im*θ[2]))
        Γ = exp(μ[1]+im*θ[1]) - exp(-μ[1]-im*θ[1]) + exp(μ[2]+im*θ[2]) - exp(-μ[2]-im*θ[2])
        return (1/2π)*(1/2π)*log(abs(t*T+td*Td+γ*Γ-E[1]-im*E[2]))
    end
    hcubature(dR,[0,0],[2π,2π],rtol=1e-5)[1]
end

function Φ(E::Vector{T}) where {T}
    Ronkin(μ) = R(μ,E)
    res = optimize(Ronkin,zeros(T,2))
    return Optim.minimum(res)
end

function DOS1(E)
    (1/2π)*tr(ForwardDiff.hessian(Φ,E))
end

function DOS2(E)
    δx = δy = 0.1
    Exy = E
    Exp = E.+[δx,0]
    Exm = E.+[-δx,0]
    Eyp = E.+[0,δy]
    Eym = E.+[0,-δy]
    return (1/2π)*((Φ(Exp)-2Φ(Exy)+Φ(Exm))/(δx^2) + (Φ(Eyp)-2Φ(Exy)+Φ(Eym))/(δy^2))
end
t = 1.0
td = 0.5
γ = 0.2
E = [0.0,0.0]
DOS1(E),DOS2(E)
(-1.4135798584282297e-16, 0.14388277024025198)

It looks like either your optimization algorithm or your integration algorithm is nearly constant around your solution, which means the algorithmic derivatives will be close to zero. I’ll take a look.
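In the meantime, one quick diagnostic would be to compare the gradient ForwardDiff computes for Φ with a central-difference gradient at the same point (just a sketch reusing your Φ; E0 and the step δ are arbitrary):

E0 = [0.0, 0.0]
g_ad = ForwardDiff.gradient(Φ, E0)                        # gradient as seen by autodiff
δ = 1e-3
g_fd = [(Φ(E0 .+ [δ, 0.0]) - Φ(E0 .- [δ, 0.0])) / (2δ),   # central differences
        (Φ(E0 .+ [0.0, δ]) - Φ(E0 .- [0.0, δ])) / (2δ)]
@show g_ad g_fd   # a large mismatch suggests AD is not propagating through optimize/hcubature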

FYI, it may be related to the nature of R(μ,E). Namely, for a fixed E, say E=E0, R(μ,E0) can attain its minimum over a whole region (not at a single point) as a function of (μ[1],μ[2]). See this topic as an example.
The important features are

  1. When R(μ,E0) has a minimizer region, the integrand of R(μ,E0) is well-defined there.
  2. When R(μ,E0) has a minimizer point, the integrand of R(μ,E0) shows a logarithmic singularity at that point.

although, in both cases, R(μ,E0) itself is well-defined and takes a finite value.

My observation is that DOS1(E) does not work when a minimizer point exists. For instance,

  1. At E0 = [0.0,0.0], which provides a minimizer point,
    (DOS1(E0),DOS2(E0))=(2.8271597168564594e-16, 0.1418357989357821).

  2. At E0 = [-3.0,3.0], which gives a minimizer region,
    (DOS1(E0),DOS2(E0))=(0.0, -0.00018425790497728805).

For a given E0, you can see whether R(μ,E0) has a minimizer point or a minimizer region by computing the amoeba of the polynomial p(z,w) = t*T + td*Td + γ*Γ - E0[1] - im*E0[2], where z = exp(μ[1]+im*θ[1]) and w = exp(μ[2]+im*θ[2]).
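(In case it is useful, a crude numerical membership test for the amoeba could look like the sketch below; the names p_of and in_amoeba are just for illustration, and t, td, γ are the constants defined above.)

# Hypothetical helper: the Laurent polynomial p evaluated at z = exp(μ[1]+im*θ[1]),
# w = exp(μ[2]+im*θ[2]); this is the argument of log(abs(...)) in dR above.
function p_of(μ, θ, E0)
    z = exp(μ[1] + im*θ[1]); w = exp(μ[2] + im*θ[2])
    T  = z + 1/z + w + 1/w
    Td = (z + 1/z)*(w + 1/w)
    Γ  = z - 1/z + w - 1/w
    return t*T + td*Td + γ*Γ - E0[1] - im*E0[2]
end

# Crude test: (μ[1], μ[2]) lies in the amoeba iff p vanishes for some (θ[1], θ[2]).
# Here we just minimize |p| over a θ-grid and threshold, which is only approximate.
function in_amoeba(μ, E0; nθ = 200, tol = 1e-2)
    θs = range(0, 2π; length = nθ)
    pmin = minimum(abs(p_of(μ, [θ1, θ2], E0)) for θ1 in θs, θ2 in θs)
    return pmin < tol
end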

But then how do you interpret the result of finite differences in the case of a minimizer point? Does it make any more mathematical sense?

Roughly speaking, DOS(E) behaves as DOS(E) ~ Σ_i δ(E-Ei), where each Ei is a value of E for which R(μ,Ei) has a minimizer point and δ is a 2d delta function. This means that if E=E0 corresponds to a minimizer region, DOS(E0) is zero; in the case of a minimizer point, DOS(E0) diverges. The finite-difference result can therefore be understood as that divergence, suppressed by the discretization of the Laplacian.
So DOS1(E0) is not expected to match DOS2(E0), but it should return a non-zero (possibly quite large) value in the case of a minimizer point.
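For reference, written out, the relation being used (matching DOS1 above) is DOS(E) = (1/2π)*(∂²Φ/∂E[1]² + ∂²Φ/∂E[2]²) ~ Σ_i δ(E-Ei), i.e. the Laplacian of Φ in E up to the 1/2π prefactor, with the sum running over the Ei at which R(μ,Ei) has a minimizer point.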

Actually, I’m trying to reproduce Fig. 4(a) in this paper. See Eqs. (7), (8), (18), (19), (20) and Sec. IV.