# TemporalGPs precise zero estimate

Hi @willtebbutt I’m trying to use TemporalGPs and I wonder if you might have a moment to address an issue I’m having.

The data have about 460 observations for each of 15 values of `x`, for a total of about 6900 observations. Taking the mean of the `y` values at each `x` looks like this:

``````using AlgebraOfGraphics, Chain, CairoMakie

@chain d begin
groupby(:x)
combine(nrow, :y .=> [mean, std])
# data(_) * mapping(:x, :y_mean; markersize=:nrow => (x->x/30))
# draw
end

15×4 DataFrame
Row │ x      nrow   y_mean        y_std
│ Int64  Int64  Float64       Float64
─────┼─────────────────────────────────────
1 │     0    448   0.170807     1.93473
2 │     1    457  -0.201122     2.03551
3 │     2    471  -0.0526347    2.02754
4 │     3    458  -0.0742196    2.08856
5 │     4    458  -0.0913503    1.90172
6 │     5    460   0.0391391    1.89031
7 │     6    450   0.0907834    1.8109
8 │     7    465   0.140631     1.87325
9 │     8    458  -0.0287554    1.81442
10 │     9    444   0.0512044    1.82845
11 │    10    461  -0.000119172  2.01421
12 │    11    473  -0.149982     1.78168
13 │    12    467  -0.0305379    1.87575
14 │    13    467   0.127604     1.86084
15 │    14    480  -0.09409      1.98472

``````

Using Matern kernels with BFGS produces extremely small estimates, with means at 1e-15 and stds at 1e-7. Based on the graph above a small estimate makes sense, but such a highly precise estimate at zero seems extreme. I tried a few different starting values and a few different Matern kernels but they all produced the same result.

Do you agree this seems like an overly precise zero? How can I address it?

``````julia> θ
(s_kernel = 7.223778766269905e-8, λ = 4.53758447041918, s_noise = 3.6792196321922757)
``````
``````using CSV, DataFrames, TemporalGPs, AbstractGPs, KernelFunctions, ParameterHandling, Optim, Zygote

d = CSV.read("diagnostic_timeseries.csv", DataFrame) .|> identity

kern(params) = params.s_kernel^2 * (params.λ * Matern52Kernel() + ConstantKernel())

build_model(params) = to_sde(GP(kern(params)), SArrayStorage(Float64))

function handlexy(df, θ₀)
nlml = (θ_flat) -> let
θ = unpack(θ_flat)
f = build_model(θ)
-logpdf(f(df.x, θ.s_noise), df.y)
end

θ_flat_init, unflatten = ParameterHandling.flatten(θ₀)
unpack(θ_flat::Vector{Float64}) = ParameterHandling.value(unflatten(θ_flat))

# check that it works
@show(nlml(θ_flat_init))
local results
try
results = @time Optim.optimize(nlml, θ->gradient(nlml, θ)[1], θ_flat_init, BFGS(), Optim.Options(show_trace=false,); inplace=false)
catch e
if e isa DomainError || e isa ArgumentError
return nothing
else
rethrow()
end
else
return θ_bfgs = unpack(results.minimizer)
end
end

θ₀ = (s_kernel=positive(.1), λ=positive(1/10), s_noise=positive(2.))
θ = handlexy(d, θ₀)
post = posterior(build_model(θ)(d.x, θ.s_noise), d.y)
plot_xs = 0:Rational(1/100):maximum(d.x)
plot_ms = marginals(post(plot_xs))
plot_means = mean.(plot_ms)
plot_stds = std.(plot_ms)
θ
``````

Data:

Thanks for opening the issue.

It looks to me like it’s operating in a correct manner, but I agree that I’m not sure whether it’s the model you want.

Based on the hyperparameters that the model has learned, `(s_kernel = 7.223778766269905e-8, λ = 4.53758447041918, s_noise = 3.6792196321922757)`, it looks like the GP is entirely ignored (`s_kernel` is very small), and the noise is used to explain all of the data (`s_noise` is comparatively very large).

This means that if you ask “what value will the GP take” for any input, the answer everywhere will be close to 0 with high probability. If you replace `marginals(post(plot_xs))` with `marginals(post(plot_xs, θ.s_noise))` you’ll get what the model thinks it will see at `plot_xs`, which is just going to be a nearly-zero-mean Gaussian with variance `θ.s_noise^2`.

To summarise, I think what’s going on is that your model hasn’t been able to find any useful structure with which to make predictions. I would maybe try playing around with your kernel and the lengthscale initialisation – it might be that you’re able to pick up on some structure by doing so. Perhaps try a rougher kernel like the `Matern12Kernel` or `Matern32Kernel` – possibly the `Matern52Kernel` is a bit too smooth for your data or something.

1 Like

Trying 15 different combinations of kernels and initial lengthscales, two of them fail with an error and the rest of the `s_kernel` are all very small.

Is there something else I should try, or just conclude the model finds no variation over x values?

If I don’t catch the error it looks like this which I don’t really understand. Why would `sqrt` be getting a negative number?

Code:

``````function handlexy(df, θ₀, basekernel)
kernel(params) = params.s_kernel^2 * (params.λ * basekernel + ConstantKernel())

build_model(params) = to_sde(GP(kernel(params)), SArrayStorage(Float64))

nlml(θ_flat) = let
θ = unpack(θ_flat)
f = build_model(θ)
-logpdf(f(df.x, θ.s_noise), df.y)
end

θ_flat_init, unflatten = ParameterHandling.flatten(θ₀)
unpack(θ_flat::Vector{Float64}) = ParameterHandling.value(unflatten(θ_flat))

# check that it works
@show(nlml(θ_flat_init))
local results
try
results = @time Optim.optimize(nlml, θ->gradient(nlml, θ)[1], θ_flat_init, BFGS(), Optim.Options(show_trace=false,); inplace=false)
catch e
if e isa DomainError || e isa ArgumentError
return nothing
else
rethrow()
end
else
return θ_bfgs = unpack(results.minimizer)
end
end

ls = [1,3,5,7,9]
θ₀s = map(l->(s_kernel=positive(.1), λ=positive(1/l), s_noise=positive(2.)), ls)
basekernels = [Matern12Kernel(), Matern32Kernel(), Matern52Kernel()]
θs = @showprogress map(Base.splat(handlexy), Iterators.product([d], θ₀s, basekernels))
``````

Result:

``````julia> [θ for θ in θs if !isnothing(θ)]
13-element Vector{NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{Float64, Float64, Float64}}}:
(s_kernel = 1.4901161193847656e-8, λ = 1.0439546690906695e13, s_noise = 3.6770055623609044)
(s_kernel = 6.014543554000831e-7, λ = 6.407922242514216e9, s_noise = 3.677005563380498)
(s_kernel = 2.6202221112199158e-6, λ = 4.667631200331745, s_noise = 3.6792196321448594)
(s_kernel = 5.99181758737612e-8, λ = 6.437847689693439, s_noise = 3.6792196321923574)
(s_kernel = 1.6511956718658313e-8, λ = 0.9626660078398387, s_noise = 3.679219622454649)
(s_kernel = 2.293699888145554e-8, λ = 10.812077021350452, s_noise = 3.6792196321923134)
(s_kernel = 3.0405739544690883e-8, λ = 4.803021431206891, s_noise = 3.6792196321925354)
(s_kernel = 2.0420200755515376e-8, λ = 2.4981775446042214, s_noise = 3.6792196321923214)
(s_kernel = 2.0726151863253984e-8, λ = 0.9669139968260427, s_noise = 3.679219632193829)
(s_kernel = 3.202352488984103e-8, λ = 10.758448834666707, s_noise = 3.6792196321927104)
(s_kernel = 7.223778766269905e-8, λ = 4.53758447041918, s_noise = 3.6792196321922757)
(s_kernel = 2.4556263826700986e-8, λ = 3.194522400554354, s_noise = 3.679219632192327)
(s_kernel = 1.4901161193847656e-8, λ = 7.500396019759766e12, s_noise = 3.677621795619176)
``````

If I don’t catch the error:

``````julia> handlexy(d, (s_kernel=positive(.1), λ=positive(1/3), s_noise=positive(2.)), Matern12Kernel())
nlml(θ_flat_init) = 15113.795852750627
ERROR: DomainError with -1.332905371020364:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).
Stacktrace:
[1] throw_complex_domainerror(f::Symbol, x::Float64)
@ Base.Math ./math.jl:33
[2] sqrt
@ ./math.jl:591 [inlined]
[3] posterior_and_lml(x::TemporalGPs.Gaussian{StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SMatrix{2, 2, Float64, 4}}, f::TemporalGPs.ScalarOutputLGC{LinearAlgebra.Adjoint{Float64, StaticArraysCore.SVector{2, Float64}}, Float64, Float64}, y::Float64)
@ TemporalGPs ~/.julia/packages/TemporalGPs/NtnD2/src/models/linear_gaussian_conditionals.jl:292
[4] step_logpdf
@ ~/.julia/packages/TemporalGPs/NtnD2/src/models/lgssm.jl:171 [inlined]
[5] step_logpdf
@ ~/.julia/packages/TemporalGPs/NtnD2/src/models/lgssm.jl:167 [inlined]
[6] _pullback(#unused#::Zygote.Context, #unused#::typeof(TemporalGPs.scan_emit), f::typeof(TemporalGPs.step_logpdf), xs::Base.Iterators.Zip{Tuple{TemporalGPs.LGSSM{TemporalGPs.GaussMarkovModel{TemporalGPs.Forward, Vector{StaticArraysCore.SMatrix{2, 2, Float64, 4}}, FillArrays.Fill{FillArrays.Zeros{Float64, 1, Tuple{Base.OneTo{Int64}}}, 1, Tuple{Base.OneTo{Int64}}}, Vector{StaticArraysCore.SMatrix{2, 2, Float64, 4}}, TemporalGPs.Gaussian{StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SMatrix{2, 2, Float64, 4}}}, StructArrays.StructVector{TemporalGPs.ScalarOutputLGC{LinearAlgebra.Adjoint{Float64, StaticArraysCore.SVector{2, Float64}}, Float64, Float64}, NamedTuple{(:A, :a, :Q), Tuple{FillArrays.Fill{LinearAlgebra.Adjoint{Float64, StaticArraysCore.SVector{2, Float64}}, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float64}}}, Int64}}, Vector{Float64}}}, init_state::TemporalGPs.Gaussian{StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SMatrix{2, 2, Float64, 4}}, idx::UnitRange{Int64})
@ TemporalGPs ~/.julia/packages/TemporalGPs/NtnD2/src/util/scan.jl:43
[7] _pullback
@ ~/.julia/packages/TemporalGPs/NtnD2/src/models/lgssm.jl:164 [inlined]
[8] _pullback(::Zygote.Context, ::typeof(logpdf), ::TemporalGPs.LGSSM{TemporalGPs.GaussMarkovModel{TemporalGPs.Forward, Vector{StaticArraysCore.SMatrix{2, 2, Float64, 4}}, FillArrays.Fill{FillArrays.Zeros{Float64, 1, Tuple{Base.OneTo{Int64}}}, 1, Tuple{Base.OneTo{Int64}}}, Vector{StaticArraysCore.SMatrix{2, 2, Float64, 4}}, TemporalGPs.Gaussian{StaticArraysCore.SVector{2, Float64}, StaticArraysCore.SMatrix{2, 2, Float64, 4}}}, StructArrays.StructVector{TemporalGPs.ScalarOutputLGC{LinearAlgebra.Adjoint{Float64, StaticArraysCore.SVector{2, Float64}}, Float64, Float64}, NamedTuple{(:A, :a, :Q), Tuple{FillArrays.Fill{LinearAlgebra.Adjoint{Float64, StaticArraysCore.SVector{2, Float64}}, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float64}}}, Int64}}, ::Vector{Float64})
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
[9] _pullback
@ ~/.julia/packages/TemporalGPs/NtnD2/src/gp/lti_sde.jl:69 [inlined]
[10] _pullback(::Zygote.Context, ::typeof(TemporalGPs._logpdf), ::AbstractGPs.FiniteGP{TemporalGPs.LTISDE{GP{AbstractGPs.ZeroMean{Float64}, ScaledKernel{KernelSum{Tuple{ScaledKernel{ExponentialKernel{Distances.Euclidean}, Float64}, ConstantKernel{Float64}}}, Float64}}, SArrayStorage{Float64}}, Vector{Int64}, LinearAlgebra.Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}}}, ::Vector{Float64})
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
[11] _pullback
@ ~/.julia/packages/TemporalGPs/NtnD2/src/gp/lti_sde.jl:62 [inlined]
[12] _pullback(::Zygote.Context, ::typeof(logpdf), ::AbstractGPs.FiniteGP{TemporalGPs.LTISDE{GP{AbstractGPs.ZeroMean{Float64}, ScaledKernel{KernelSum{Tuple{ScaledKernel{ExponentialKernel{Distances.Euclidean}, Float64}, ConstantKernel{Float64}}}, Float64}}, SArrayStorage{Float64}}, Vector{Int64}, LinearAlgebra.Diagonal{Float64, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}}}, ::Vector{Float64})
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
[13] _pullback
@ ~/src/wellbeing/src/explore/src03/gpdemo.jl:16 [inlined]
[14] _pullback(ctx::Zygote.Context, f::var"#nlml#22"{DataFrame, var"#build_model#21"{var"#kernel#20"{ExponentialKernel{Distances.Euclidean}}}}, args::Vector{Float64})
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface2.jl:0
[15] _pullback(f::Function, args::Vector{Float64})
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface.jl:34
[16] pullback(f::Function, args::Vector{Float64})
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface.jl:40
@ Zygote ~/.julia/packages/Zygote/TaBlo/src/compiler/interface.jl:75
[18] (::var"#19#24"{var"#nlml#22"{DataFrame, var"#build_model#21"{var"#kernel#20"{ExponentialKernel{Distances.Euclidean}}}}})(θ::Vector{Float64})
@ Main ~/src/wellbeing/src/explore/src03/gpdemo.jl:26
[19] (::NLSolversBase.var"#gg!#2"{var"#19#24"{var"#nlml#22"{DataFrame, var"#build_model#21"{var"#kernel#20"{ExponentialKernel{Distances.Euclidean}}}}}})(G::Vector{Float64}, x::Vector{Float64})
@ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/objective_types/inplace_factory.jl:21
[20] (::NLSolversBase.var"#fg!#8"{var"#nlml#22"{DataFrame, var"#build_model#21"{var"#kernel#20"{ExponentialKernel{Distances.Euclidean}}}}, NLSolversBase.var"#gg!#2"{var"#19#24"{var"#nlml#22"{DataFrame, var"#build_model#21"{var"#kernel#20"{ExponentialKernel{Distances.Euclidean}}}}}}})(gx::Vector{Float64}, x::Vector{Float64})
@ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/objective_types/abstract.jl:13
@ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/interface.jl:82
@ NLSolversBase ~/.julia/packages/NLSolversBase/cfJrN/src/interface.jl:69
@ Optim ~/.julia/packages/Optim/ocQqB/src/Manifolds.jl:50
[24] (::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}})(α::Float64)
@ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/LineSearches.jl:84
[25] (::LineSearches.HagerZhang{Float64, Base.RefValue{Bool}})(ϕ::Function, ϕdϕ::LineSearches.var"#ϕdϕ#6"{Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}}, Vector{Float64}, Vector{Float64}, Vector{Float64}}, c::Float64, phi_0::Float64, dphi_0::Float64)
@ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:139
[26] HagerZhang
@ ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:101 [inlined]
[27] perform_linesearch!(state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, d::Optim.ManifoldObjective{OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}})
@ Optim ~/.julia/packages/Optim/ocQqB/src/utilities/perform_linesearch.jl:59
[28] update_state!(d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat})
@ Optim ~/.julia/packages/Optim/ocQqB/src/multivariate/solvers/first_order/bfgs.jl:139
[29] optimize(d::OnceDifferentiable{Float64, Vector{Float64}, Vector{Float64}}, initial_x::Vector{Float64}, method::BFGS{LineSearches.InitialStatic{Float64}, LineSearches.HagerZhang{Float64, Base.RefValue{Bool}}, Nothing, Nothing, Flat}, options::Optim.Options{Float64, Nothing}, state::Optim.BFGSState{Vector{Float64}, Matrix{Float64}, Float64, Vector{Float64}})
@ Optim ~/.julia/packages/Optim/ocQqB/src/multivariate/optimize/optimize.jl:54
[30] optimize
@ ~/.julia/packages/Optim/ocQqB/src/multivariate/optimize/optimize.jl:36 [inlined]
[31] #optimize#91
@ ~/.julia/packages/Optim/ocQqB/src/multivariate/optimize/interface.jl:155 [inlined]
[32] macro expansion
@ ./timing.jl:262 [inlined]
[33] handlexy(df::DataFrame, θ₀::NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, basekernel::ExponentialKernel{Distances.Euclidean})
``````

Trying 15 different combinations of kernels and initial lengthscales, two of them fail with an error and the rest of the `s_kernel` are all very small.

Is there something else I should try, or just conclude the model finds no variation over x values?

I would be inclined to conclude this unfortunately.

If I don’t catch the error it looks like this which I don’t really understand. Why would `sqrt` be getting a negative number?

This is probably just a numerical issue. You’ve got a very large number of repeated observations, which generally isn’t going to be hugely numerically stable in TemporalGPs (with the way that it currently handles things) unfortunately.

1 Like

Thanks, I’m glad that my guesses weren’t wildly off track.

The other thing I’ve been thinking about is performance. Outside the regular Julia performance tips, are there any TemporalGPs performance tips? I wasn’t sure for example if I should try to use `optimize(...; inplace=true)` or if there’s another way to get speed.

``````@time handlexy(d, θ₀s[1], basekernels[1]);
95.976136 seconds (303.83 M allocations: 19.428 GiB, 4.24% gc time)
``````
``````function handlexy(df, θ₀, basekernel)
θ_flat_init, unflatten = ParameterHandling.flatten(θ₀)
unpack(θ_flat::Vector{Float64}) = ParameterHandling.value(unflatten(θ_flat))

# Using `let` to avoid captured variables reported by JET, as in
# https://docs.julialang.org/en/v1/manual/performance-tips/
# though it doesn't seem to help much when I measured with `@time`.
let unpack = unpack

kernel(params) = params.s_kernel^2 * (params.λ * basekernel + ConstantKernel())

build_model(params) = to_sde(GP(kernel(params)), SArrayStorage(Float64))

nlml(θ_flat) = let
θ = unpack(θ_flat)
f = build_model(θ)
-logpdf(f(df.x, θ.s_noise), df.y)
end

local results
try
results = @time Optim.optimize(nlml, θ->gradient(nlml, θ)[1], θ_flat_init, BFGS(), Optim.Options(show_trace=false,); inplace=false)
catch e
if e isa DomainError || e isa ArgumentError
return nothing
else
rethrow()
end
else
return θ_bfgs = unpack(results.minimizer)
end
end
end
``````

Hmm that does appear to be taking quite a while – I’m really not sure why it would allocate so much, given that you’re using the `SArrayStorage` stuff. The `inplace=true` option definitely won’t help much here. Unfortunately I’m really not sure what you could do to improve things here

1 Like

I tried a different dataset with more data. It takes about an hour to fit each model and two of them fail with an exception, presumably some numerical issue. The plots look pretty good, especially the 5/2 kernel – the columns show the initial parameter values.

What I’m concerned about in this run is the actual parameter estimates. They seem to work fine for generating predictions but I’m not sure how to interpret them: the `s_kernel` values are really small and the `λ` values are really large. I normally think of the parameter values as being on the scale of the data, but these seem far removed, despite their apparently good fit in the plots.

How can these values be interpreted, if at all? Is it possible to constrain the estimation so it finds less extreme values? I might think of going with a Bayesian approach but given how slow the pure optimization is, MCMC seems like it would be too slow to apply here.

``````julia> gpfits.θ_bfgs
8-element Vector{NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{Float64, Float64, Float64}}}:
(s_kernel = 1.4908101137969295e-8, λ = 0.025962910168048792, s_noise = 3.5060088658751933)
(s_kernel = 4.8174094535042855e-8, λ = 0.03685441202282729, s_noise = 3.506008865875218)
(s_kernel = 7.508853971558238e-7, λ = 1.2448994513719962e9, s_noise = 3.505258849233501)
(s_kernel = 8.405463328045445e-7, λ = 9.341600494320921e8, s_noise = 3.505312938283503)
(s_kernel = 1.044890095933443e-6, λ = 6.428978024354879e8, s_noise = 3.505258785180537)
(s_kernel = 8.890308859122133e-7, λ = 8.350477376777534e8, s_noise = 3.5053129660305795)
(s_kernel = 7.664342603914228e-7, λ = 1.1949049647764115e9, s_noise = 3.505258747072675)
(s_kernel = 7.971423406040704e-7, λ = 1.0386582201777421e9, s_noise = 3.5053129193865007)
``````
``````using CSV, DataFrames, TemporalGPs, AbstractGPs, KernelFunctions, ParameterHandling, Optim, Zygote, ProgressMeter, ThreadsX, Random, Tables, AlgebraOfGraphics, CairoMakie, Chain

kernel(params, basekernel) = params.s_kernel^2 * (params.λ * basekernel + ConstantKernel())

build_model(params, basekernel) = to_sde(GP(kernel(params, basekernel)), SArrayStorage(Float64))

function handlexy(df, θ₀, basekernel)
θ_flat_init, unflatten = ParameterHandling.flatten(θ₀)
unpack(θ_flat::Vector{Float64}) = ParameterHandling.value(unflatten(θ_flat))

nlml(θ_flat) = let
θ = unpack(θ_flat)
f = build_model(θ, basekernel)
-logpdf(f(df.x, θ.s_noise), df.y)
end

# @show θ₀, basekernel

local results
try
results = @time Optim.optimize(nlml, θ->gradient(nlml, θ)[1], θ_flat_init, BFGS(), Optim.Options(show_trace=false,); inplace=false)
catch e
rethrow()
if e isa DomainError || e isa ArgumentError
return nothing
else
rethrow()
end
else
return θ_bfgs = unpack(results.minimizer)
end
end

θ₀s = map(Iterators.product(
[.1],
inv.([5,15,30,60]),
[ 3., ]
)) do (ν,λ,ϵ)
(s_kernel=positive(ν), λ=positive(λ), s_noise=positive(ϵ))
end
basekernels = [Matern12Kernel(),Matern52Kernel()]

gpfits = foldl(crossjoin, [DataFrame(df=[dframe]), DataFrame(θ₀=vec(θ₀s)), DataFrame(basekernel=basekernels)])

@time transform!(gpfits, AsTable(:) => (xs->ThreadsX.map(x->(handlexy(x.df, x.θ₀, x.basekernel)), rowtable(xs))) => :θ_bfgs)

##
transform!(gpfits,
[:θ_bfgs, :df, :basekernel] => ByRow((θ,d, bk)->posterior(build_model(θ, bk)(d.x, θ.s_noise), d.y)) => :posterior,
)
transform!(gpfits,
:df => ByRow(df->minimum(df.x):Rational(1//1000):maximum(df.x)-(1//1000)) => :pred_x,
)
transform!(gpfits,
[:posterior, :pred_x] => ByRow((f,xs)->marginals(f(xs))) => :marginals
)
mean_sub_std(x, r) = mean(x) .- r * std(x)
mean_add_std(x, r) = mean(x) .+ r * std(x)
transform!(gpfits,
:marginals => ByRow(ByRow(mean)) => :gp_mean,
:marginals => ByRow(ByRow(x->mean_sub_std(x,2))) => :gp_lower,
)
names(gpfits)

datapoints = @chain dframe groupby(:x) combine(nrow, :y => mean)
datapoints_spec = (data(datapoints) * mapping(:x, :y_mean; markersize=:nrow=>(x->x/600)) * visual(Scatter))

gpflattened = DataFrames.flatten(gpfits, [:pred_x, :gp_mean, :gp_lower, :gp_upper])
gp_spec = (
data(gpflattened)
* mapping(:pred_x; row=:θ₀=>x->round.(ParameterHandling.value.(Tuple(x));digits=2), col=:basekernel=>(x->string(typeof(x).name.wrapper)))
* (
mapping(:gp_mean) * visual(Lines)
+ mapping(:gp_lower, :gp_upper) * visual(Band; alpha=.3, color=:gray)
)
)
draw(gp_spec + datapoints_spec)
``````

@willtebbutt

1 Like

I’d like to have some more reasonable uncertainty bounds on the GP means, e.g. something at least wider than 1e-7, which seems like too strong a statement about the lack of structure in the data – I’m reluctant to say “the mean is definitely near-exactly zero over the whole period”.

I tried to use DynamicHMC to see if that might help widen the confidence bounds on the mean. I’m not actually sure if it’s likely to produce better uncertainty, or even if it’s likely to have acceptable performance.

Either way, I must be misunderstanding how to use something. I tried doing it with ParameterHandling and got an error. If you happen to have a moment to see what I’m doing wrong, the codes and errors are below.

``````using CSV, DataFrames, TemporalGPs, AbstractGPs, KernelFunctions, ParameterHandling, Optim, Zygote, ProgressMeter, ThreadsX, Random, DynamicHMC, LogDensityProblems, Distributions

kernel(params, basekernel) = params.s_kernel^2 * (params.λ * basekernel + ConstantKernel())

build_model(params, basekernel) = to_sde(GP(kernel(params, basekernel)),  SArrayStorage(Float64))

function makeloglik(df, θ₀, basekernel)
θ_flat_init, unflatten = ParameterHandling.flatten(θ₀)
unpack(θ_flat) = ParameterHandling.value(unflatten(θ_flat))
ll(θ_flat) = let
θ = unpack(θ_flat)
f = build_model(θ, basekernel)
logpdf(f(df.days_after, θ.s_noise), df.y)
end
end

basekernel = Matern32Kernel()
θ₀ = (;s_kernel=positive(.05), λ=positive(1/10), s_noise=positive(3.))
loglik = makeloglik(df, θ₀, basekernel)

logprior(params) = sum(loglikelihood.(Exponential(1.0), params))

function LogDensityProblems.logdensity(ℓ::typeof(loglik), params)
ℓ(params) + logprior(params)
end

LogDensityProblems.dimension(::typeof(loglik)) = 3

function LogDensityProblems.capabilities(::Type{<:typeof(loglik)})
return LogDensityProblems.LogDensityOrder{0}()
end

n_samples = 2_000

samples =
mcmc_with_warmup(
Random.GLOBAL_RNG,
n_samples;
reporter=NoProgressReport(),
)

LoadError: MethodError: no method matching (::ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}})(::Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#14"{DataFrame, Matern32Kernel{Distances.Euclidean}, var"#unpack#13"{ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}}, Float64}, Float64, 3}})
Closest candidates are:
(::ParameterHandling.var"#unflatten_to_NamedTuple#16"{T})(::Vector{T}) where T at ~/.julia/packages/ParameterHandling/nG33f/src/flatten.jl:87
Stacktrace:
[1] (::var"#unpack#13"{ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}})(θ_flat::Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#14"{DataFrame, Matern32Kernel{Distances.Euclidean}, var"#unpack#13"{ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}}, Float64}, Float64, 3}})
@ Main ~/gpdemo-bayes.jl:14
[2] (::var"#ll#14"{DataFrame, Matern32Kernel{Distances.Euclidean}, var"#unpack#13"{ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}})(θ_flat::Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#14"{DataFrame, Matern32Kernel{Distances.Euclidean}, var"#unpack#13"{ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}}, Float64}, Float64, 3}})
@ Main ~/gpdemo-bayes.jl:16
[3] logdensity(ℓ::var"#ll#14"{DataFrame, Matern32Kernel{Distances.Euclidean}, var"#unpack#13"{ParameterHandling.var"#unflatten_to_NamedTuple#16"{Float64, NamedTuple{(:s_kernel, :λ, :s_noise), Tuple{ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.Positive{Float64, typeof(exp), Float64}}}, ParameterHandling.var"#unflatten_to_Tuple#14"{Float64, Tuple{Int64, Int64, Int64}, Tuple{Int64, Int64, Int64}, Tuple{ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}, ...DualAveraging{Float64}}}, algorithm::DynamicHMC.NUTS{Val{:generalized}}, .Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}, ParameterHandling.var"#unflatten_Positive#24"{Float64, ParameterHandling.Positive{Float64, typeof(exp), Float64}, ParameterHandling.var"#unflatten_to_Real#2"{Float64, Float64}}}}}}}}, Float64}, Float64, 3}}})
@ Forreporter::NoProgressReport)
@ DynamicHMC ~/.julia/packages/DynamicHMC/x07K9/src/mcmc.jl:547
[16] top-level scope
``````

and I also tried without ParameterHandling

``````using CSV, DataFrames, TemporalGPs, AbstractGPs, KernelFunctions, ParameterHandling, Optim, Zygote, ProgressMeter, ThreadsX, Random, DynamicHMC, LogDensityProblems, Distributions

kernel(params, basekernel) = params[1]^2 * (params[2] * basekernel + ConstantKernel())

build_model(params, basekernel) = to_sde(GP(kernel(params, basekernel)),  SArrayStorage(Float64))

function makeloglik(df, basekernel)
ll(θ) = let
f = build_model(θ, basekernel)
logpdf(f(df.x, θ[3]), df.y)
end
end

basekernel = Matern32Kernel()
θ₀ = (;s_kernel=positive(.05), λ=positive(1/10), s_noise=positive(3.))
loglik = makeloglik(df, basekernel)

logprior(params) = sum(loglikelihood.(Exponential(1.0), params))

function LogDensityProblems.logdensity(ℓ::typeof(loglik), params)
ℓ(params) + logprior(params)
end

LogDensityProblems.dimension(::typeof(loglik)) = 3

function LogDensityProblems.capabilities(::Type{<:typeof(loglik)})
return LogDensityProblems.LogDensityOrder{0}()
end

n_samples = 2_000

samples =
mcmc_with_warmup(
Random.GLOBAL_RNG,
n_samples;
reporter=NoProgressReport(),
)
ArgumentError: ScaledKernel: σ² = Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}}(-1.5325166874186795,0.0,1.0,0.0) does not satisfy the constraint σ² > 0.
Stacktrace:
[1] macro expansion
@ ~/.julia/packages/KernelFunctions/hnfwW/src/utils.jl:5 [inlined]
[2] ScaledKernel
@ ~/.julia/packages/KernelFunctions/hnfwW/src/kernels/scaledkernel.jl:20 [inlined]
[3] *
@ ~/.julia/packages/KernelFunctions/hnfwW/src/kernels/scaledkernel.jl:64 [inlined]
[4] kernel(params::Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}}, basekernel::Matern32Kernel{Distances.Euclidean})
@ Main ~/gpdemo-bayes-raw.jl:7
[5] build_model
@ ~/gpdemo-bayes-raw.jl:9 [inlined]
[6] (::var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}})(θ::Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}})
@ Main ~/gpdemo-bayes-raw.jl:14
[7] logdensity(ℓ::var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}, params::Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}})
@ Main ~/gpdemo-bayes-raw.jl:27
[8] #46
[9] vector_mode_dual_eval!
@ ~/.julia/packages/ForwardDiff/pDtsf/src/apiutils.jl:37 [inlined]
[10] vector_mode_gradient!(result::DiffResults.MutableDiffResult{1, Float64, Tuple{Vector{Float64}}}, f::LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, x::Vector{Float64}, cfg::ForwardDiff.GradientConfig{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}}})
[14] evaluate_ℓ(ℓ::LogDensityProblems.ForwardDiffLogDensity{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}, ForwardDiff.GradientConfig{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}}}}, q::Vector{Float64})
@ DynamicHMC ~/.julia/packages/DynamicHMC/x07K9/src/hamiltonian.jl:195
[15] #initialize_warmup_state#25
@ ~/.julia/packages/DynamicHMC/x07K9/src/mcmc.jl:125 [inlined]
[16] initialize_warmup_state(rng::Random._GLOBAL_RNG, ℓ::LogDensityProblems.ForwardDiffLogDensity{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}, ForwardDiff.GradientConfig{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}}}})
@ DynamicHMC ~/.julia/packages/DynamicHMC/x07K9/src/mcmc.jl:123
[17] mcmc_keep_warmup(rng::Random._GLOBAL_RNG, ℓ::LogDensityProblems.ForwardDiffLogDensity{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}, ForwardDiff.GradientConfig{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}}}}, N::Int64; initialization::Tuple{}, warmup_stages::Tuple{InitialStepsizeSearch, TuningNUTS{Nothing, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{Nothing, DualAveraging{Float64}}}, algorithm::DynamicHMC.NUTS{Val{:generalized}}, reporter::NoProgressReport)
@ DynamicHMC ~/.julia/packages/DynamicHMC/x07K9/src/mcmc.jl:496
[18] macro expansion
@ ~/.julia/packages/UnPack/EkESO/src/UnPack.jl:100 [inlined]
[19] mcmc_with_warmup(rng::Random._GLOBAL_RNG, ℓ::LogDensityProblems.ForwardDiffLogDensity{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}, ForwardDiff.GradientConfig{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3, Vector{ForwardDiff.Dual{ForwardDiff.Tag{LogDensityProblems.var"#46#47"{var"#ll#5"{DataFrame, Matern32Kernel{Distances.Euclidean}}}, Float64}, Float64, 3}}}}, N::Int64; initialization::Tuple{}, warmup_stages::Tuple{InitialStepsizeSearch, TuningNUTS{Nothing, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{LinearAlgebra.Diagonal, DualAveraging{Float64}}, TuningNUTS{Nothing, DualAveraging{Float64}}}, algorithm::DynamicHMC.NUTS{Val{:generalized}}, reporter::NoProgressReport)
@ DynamicHMC ~/.julia/packages/DynamicHMC/x07K9/src/mcmc.jl:547
[20] top-level scope
``````