Numerically computing a density function from a distribution with autodifferentiation

I have a non-standard multivariate distribution function/copula (here, a copula in the MWE). I am trying to compute the density function by taking the cross-partial derivatives of the distribution/copula with respect to each variable, but I am getting some values that are not between 0 and 1. Here is a minimum working example using a bivariate Gumbel copula:

using FiniteDiff, Plots, LinearAlgebra

Cop(x, θ) = exp(-((-log(x[1]))^(θ) + (-log(x[2]))^(θ))^(1/θ)) # bivariate gumbel copula as example
Cop_fixed(x) = Cop(x,3)
cross_partial = Float64[] # concrete element type instead of Vector{Any}
grid = 0.01:0.01:0.999 # using a small grid (99 points from 0.01 to 0.99)
for i in grid, j in grid
    append!(cross_partial, FiniteDiff.finite_difference_hessian(Cop_fixed,[i,j])[1,2])
end

# reshape the flat vector into a matrix for the surface plot (j varies fastest in the loop)
plot(collect(grid), collect(grid), reshape(cross_partial, length(grid), length(grid)), st=:surface)
xflip!(true)

A lot of the values look correct, but I get some values greater than 1… Anyone have any idea why?

Why do you expect these values ∂²f/∂x₁∂x₂ to always be positive?

Whoops, an error in the MWE. Corrected here. For the given copula, the density should equal the cross-partial derivative.

A density function does not have to be between 0 and 1, so if I understand your example correctly, it doesn’t seem to be a problem.


Ahh yes, I clearly need to take a break. The area also integrates to approximately 1. Thanks for correcting my stupid error…
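
For reference, that integral check is a quick Riemann sum over the grid from the MWE (a sketch; it reuses the cross_partial vector and the 0.01 step from above):

approx_integral = sum(cross_partial) * 0.01^2 # Riemann sum over the unit square
# should come out close to 1, minus the mass the truncated grid misses near the
# corners, where the Gumbel density concentrates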

No worries! Densities are tricky beasts.

However, your finite differences are not that good:

using FiniteDiff, Plots, LinearAlgebra, Copulas, Distributions

Cop(x, θ) = exp(-((-log(x[1]))^(θ) + (-log(x[2]))^(θ))^(1/θ)) # bivariate gumbel copula as example
Cop_fixed(x) = Cop(x,3)
cross_partial = Float64[]

G = GumbelCopula(2,3)
truth = Float64[]

grid = 0.01:0.01:0.999 # using a small grid
for i in grid, j in grid
    append!(cross_partial, FiniteDiff.finite_difference_hessian(Cop_fixed,[i,j])[1,2])
    append!(truth, pdf(G,[i,j]))
end

relative_error = (cross_partial .- truth)./truth
# reshape the flat vector into a matrix for the surface plot (j varies fastest in the loop)
plot(collect(grid), collect(grid), reshape(relative_error, length(grid), length(grid)), st=:surface)
xflip!(true)

[surface plot of the relative error]

with:

julia> maximum(abs.(relative_error))
0.0001512309848833021

julia> maximum(abs.(cross_partial .- truth))
0.00961047447613339


Why FiniteDiff and not ForwardDiff?


FiniteDiff actually seemed to perform better. I figured that the forward mode in ForwardDiff would make it perform especially badly around the boundary… Is there a better autodiff mode to use for this kind of problem?

“Performs better” in terms of runtime or error?

I think it really depends on your problem, since this Gumbel is just an MWE. But my default option is to go with autodiff rather than finite differences, which is why I was surprised.


To clarify, algorithmic differentiation (e.g. with ForwardDiff or Enzyme) makes no floating point cancellation/truncation errors. And it is usually as fast as numerical differentiation (e.g. with FiniteDiff) in forward mode, and possibly much faster in reverse mode.
Here we definitely want forward mode because the input dimension is small.
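
For instance, here is a minimal forward-mode sketch of the MWE (same copula as above; gumbel_cross is just a hypothetical helper name):

using ForwardDiff

Cop(x, θ) = exp(-((-log(x[1]))^θ + (-log(x[2]))^θ)^(1/θ)) # same Gumbel copula as above
Cop_fixed(x) = Cop(x, 3)

# the [1,2] entry of the Hessian is the cross-partial, i.e. the copula density
gumbel_cross(u, v) = ForwardDiff.hessian(Cop_fixed, [u, v])[1, 2]

gumbel_cross(0.5, 0.5)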


And in general, for small vectors and matrices like these, you will get an insane speed boost from using StaticArrays.jl instead of normal arrays.
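
A minimal sketch (assuming the Cop_fixed defined above; ForwardDiff accepts static arrays directly):

using ForwardDiff, StaticArrays

# SVector inputs are stack-allocated, so the Hessian call avoids heap
# allocations and returns a static matrix
x = SVector(0.5, 0.5)
ForwardDiff.hessian(Cop_fixed, x)[1, 2]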


No, not true. It can even have unbounded error, but the same goes for finite differencing. It does tend to have a lot less error, though.

It should usually be faster, as the derivative code is generally simpler than the primal, and being able to SIMD the two together can generally make the derivative pushforward cost less than the 2x it would otherwise take.


Let me correct that: it makes no floating point errors inherent to the differentiation method. Whereas the whole point of finite differences is to compute (f(x+h) - f(x)) / h and thus strike a balance between the Taylor residual (h large) and floating point errors (h small). Autodiff has no h; that’s what I meant.
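
That trade-off is easy to see on a toy example (a sketch, differentiating f(x) = exp(x) at x = 1, where the exact derivative is exp(1)):

f(x) = exp(x)
exact = exp(1.0) # f'(1) = exp(1)
for h in (1e-1, 1e-4, 1e-8, 1e-12)
    fd = (f(1.0 + h) - f(1.0)) / h # forward difference
    println("h = $h   error = $(abs(fd - exact))")
end
# the error first shrinks with h (Taylor residual) and then grows again (cancellation)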

My formulation was influenced by the book I’m currently reading ^^ (Griewank & Walther, Evaluating Derivatives, 2008)


So, in general, is there a preferred autodifferentiation method that gives the best results around the boundary of the support of a function, like the edges of the CDF here?

I’m not sure about this specific case, but for this input dimension, ForwardDiff.jl is your best choice in general.
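
As a quick check (a sketch reusing the earlier definitions), you can compare both approaches against the reference pdf very close to the boundary:

using FiniteDiff, ForwardDiff, Copulas, Distributions

Cop_fixed(x) = exp(-((-log(x[1]))^3 + (-log(x[2]))^3)^(1/3))
G = GumbelCopula(2, 3)

u = [0.001, 0.5] # a point near the edge of the support
fd = FiniteDiff.finite_difference_hessian(Cop_fixed, u)[1, 2]
ad = ForwardDiff.hessian(Cop_fixed, u)[1, 2]
truth = pdf(G, u)
abs(fd - truth), abs(ad - truth) # compare absolute errors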

I second this, ForwardDiff is great.

In the particular Gumbel case, this is exactly what Copulas.jl does right now. The derivative of Gumbel’s generator is not that hard to compute by hand, and adding it there would improve performance, but since it is not there, the default is ForwardDiff.
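
For illustration, here is a hand-derived closed form for the bivariate Gumbel density (a sketch with a hypothetical gumbel_density helper; worth double-checking against pdf before relying on it):

using Copulas, Distributions

# closed-form bivariate Gumbel density, with A = ((-log u)^θ + (-log v)^θ)^(1/θ):
# c(u, v) = C(u, v) * (log(u)*log(v))^(θ-1) * A^(1-2θ) * (A + θ - 1) / (u*v)
function gumbel_density(u, v, θ)
    A = ((-log(u))^θ + (-log(v))^θ)^(1/θ)
    exp(-A) * (log(u) * log(v))^(θ - 1) * A^(1 - 2*θ) * (A + θ - 1) / (u * v)
end

G = GumbelCopula(2, 3)
gumbel_density(0.3, 0.7, 3) ≈ pdf(G, [0.3, 0.7]) # should hold up to rounding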