Why is this FiniteDiff derivative wrong?

What is going on here?

using FiniteDiff

randmat = rand(10, 2)
sto = similar(randmat)

function claytonsample!(sto, τ; randmat=randmat)
    sto .= randmat
    τ == 0 && return sto

    n = size(sto, 1)
    for i in 1:n
        v = sto[i, 2]
        u = sto[i, 1]
        sto[i, 2] = (1 - u^(-τ) + u^(-τ)*v^(-(τ/(1 + τ))))^(-1/τ)
    end
    return sto
end

julia> FiniteDiff.finite_difference_derivative(τ -> claytonsample!(sto, τ; randmat=randmat), 0.3)
10×2 Matrix{Float64}:
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0
 0.0  0.0

but this works:

function claytonsample(τ; randmat=randmat)
    claytonsample!(similar(randmat), τ; randmat=randmat)
end

julia> FiniteDiff.finite_difference_derivative(τ -> claytonsample(τ; randmat=randmat), 0.3)
10×2 Matrix{Float64}:
 0.0   0.129944
 0.0  -0.32573
 0.0   0.00237472
 0.0   0.0958462
 0.0   0.0604854
 0.0  -0.353726
 0.0   0.0872109
 0.0  -0.0146269
 0.0  -0.0250762
 0.0   0.0257383


Finite differencing needs to evaluate the function at two or more nearby values of τ. Since every evaluation writes into, and returns, the same matrix sto, all the returned results are the same array holding whatever the last call wrote. Their difference is exactly zero, so you get a derivative of zero.
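
To make the aliasing concrete, here is a minimal sketch that uses a hand-written central difference rather than FiniteDiff's actual implementation. The function `fill_scaled!` and the variable names are hypothetical stand-ins for `claytonsample!` and `sto`, chosen so the exact derivative is known to be `[1.0, 2.0, 3.0]`.

```julia
# Hypothetical stand-in for an in-place function that returns its buffer.
function fill_scaled!(buf, t)
    buf .= t .* [1.0, 2.0, 3.0]
    return buf
end

buf = zeros(3)
h = 1e-6
t = 0.3

# Both calls return the *same* array, so `hi` and `lo` alias each other.
hi = fill_scaled!(buf, t + h)
lo = fill_scaled!(buf, t - h)
aliased = hi === lo                    # same object in memory
naive = (hi .- lo) ./ (2h)             # difference of an array with itself

# Copying each result before the next call breaks the aliasing.
hi_copy = copy(fill_scaled!(buf, t + h))
lo_copy = copy(fill_scaled!(buf, t - h))
fixed = (hi_copy .- lo_copy) ./ (2h)   # approximates [1.0, 2.0, 3.0]
```

In the thread's terms, differentiating `τ -> copy(claytonsample!(sto, τ; randmat=randmat))` should avoid the problem for the same reason the `similar(randmat)` version does: each evaluation hands back its own array.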


I wonder why this works then:

using Optim
using FiniteDiff

randmat = rand(10, 2)
sto = similar(randmat)

function claytonsample!(sto, τ; randmat=randmat)
    sto .= randmat
    τ == 0 && return sto

    n = size(sto, 1)
    for i in 1:n
        v = sto[i, 2]
        u = sto[i, 1]
        sto[i, 2] = (1 - u^(-τ) + u^(-τ)*v^(-(τ/(1 + τ))))^(-1/τ)
    end
    return sto
end

function corr(X)
    x = @view(X[:, 1])
    y = @view(X[:, 2])
    sum(x .* y)/length(x) - sum(x) * sum(y)/length(x)^2
end

function obj(sto, τ; randmat=randmat)
    τ′ = abs(τ) - 1
    claytonsample!(sto, τ′; randmat=randmat)
    corr(sto)
end

opt = optimize(τ -> obj(sto, τ[1]), [5.0, 5.0], BFGS(); autodiff=:finite)

In words, I’m minimizing the correlation between columns in matrix sto. In theory, that is achieved with a τ of 0. Since the finite-difference derivative is incorrect, I’m surprised that it finds the minimizing value of τ. (Notice that I am tricking Optim into optimizing a univariate function, so disregard the second component of the solution vector.)

Well, that’s a different function you’re differentiating, and finite differencing handles that one correctly: you use sto to compute the returned value before the next call touches it, so even if sto gets overwritten later by another call, the value already returned doesn’t change.
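
A small sketch of that point, again with a hand-written central difference and hypothetical names (`fill_scaled!`, `total!`) standing in for `claytonsample!` and `obj`. Because the function returns a plain number derived from the shared buffer, later writes to the buffer cannot change values already returned.

```julia
# Hypothetical in-place function that overwrites and returns its buffer.
function fill_scaled!(buf, t)
    buf .= t .* [1.0, 2.0, 3.0]
    return buf
end

# Returns a plain number computed from the buffer, not the buffer itself.
total!(buf, t) = sum(fill_scaled!(buf, t))

buf = zeros(3)
h = 1e-6
t = 0.3

hi = total!(buf, t + h)    # a Float64, unaffected by later writes to buf
lo = total!(buf, t - h)    # overwrites buf, but `hi` keeps its value
slope = (hi - lo) / (2h)   # approximates the exact derivative, 6.0
```

The shared buffer is still overwritten on every call; it only stops mattering because nothing that escapes the function refers to it.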