Gaussian process regression with Turing gets stuck

I hadn’t worked on this problem for the last couple of days, but I returned to it today, and I don’t think there is actually an issue with the way the model is defined. Instead, I think the problem lies in the gradient calculation.

If I change the lengthscale slightly, the loglikelihood changes as well, as shown below, so I think the gradient should exist. (Correct me if I’m wrong on this.)

Loglikelihood Computation
julia> function f(lengthscale)
           noise = 0.1
           gp = construct_finite_gp(Xe, first(lengthscale), noise)
           return logpdf(gp, ye)
       end
f (generic function with 1 method)

julia> f(0.9)
-158.09618544618976

julia> f(1.0)
-178.65913509460518

julia> f(1.1)
-205.45899882684503
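
For reference, construct_finite_gp isn’t shown above. A minimal sketch of what such a helper might look like, using AbstractGPs.jl and KernelFunctions.jl (the squared-exponential kernel and the noise parameterization are my assumptions, not necessarily the original definition):

Helper Sketch
using AbstractGPs, KernelFunctions

# Hypothetical helper: a GP with a squared-exponential kernel of the given
# lengthscale, evaluated at inputs X with observation noise variance noise^2.
function construct_finite_gp(X, lengthscale, noise)
    kernel = with_lengthscale(SqExponentialKernel(), lengthscale)
    return GP(kernel)(X, noise^2)  # FiniteGP: a multivariate normal over f(X)
end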

Furthermore, while ForwardDiff and ReverseDiff return NaN for the gradient, Zygote actually computes it:

Gradient Calculation
julia> ForwardDiff.gradient(f, [1.])
1-element Vector{Float64}:
 NaN

julia> ReverseDiff.gradient(f, [1.])
1-element Vector{Float64}:
 NaN

julia> Zygote.gradient(f, [1.])
([-235.96836642578603],)

I don’t know why some AD backends work and others don’t. It may be a numerical issue, or a bug in the backend packages.
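
One way to narrow this down is to compare against a finite-difference estimate, which involves no AD at all: if it is also NaN, the problem is numerical; if it is finite, the NaN points at the AD backends. A quick sketch using FiniteDifferences.jl:

Finite Difference Check
using FiniteDifferences

# Fifth-order central finite-difference gradient of the same loglikelihood.
grad(central_fdm(5, 1), f, [1.0])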

Anyway, for completeness: the solution is to set the AD backend to Zygote by adding the following line:

Turing.setadbackend(:zygote)
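
For context, here is roughly where that line fits into the sampling workflow. The model below is only a sketch: it reuses the construct_finite_gp helper and assumes a log-normal prior on the lengthscale, which may differ from the original model.

Sampling with Zygote
using Turing, Zygote

Turing.setadbackend(:zygote)  # use Zygote gradients in NUTS

@model function gp_model(X, y)
    lengthscale ~ LogNormal(0.0, 1.0)  # assumed prior
    fx = construct_finite_gp(X, lengthscale, 0.1)
    y ~ fx  # the FiniteGP acts as a multivariate normal likelihood
end

chain = sample(gp_model(Xe, ye), NUTS(), 1_000)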

I’ve tested the code and it runs fine with Zygote even with duplicate data points.