I want to try sampling with TemporalGPs because it may be faster on larger datasets, as mentioned in another thread. The docs say to use Stheno's examples for reference, so I tried that. I ran into some issues and have a few questions:
- IIRC there used to be a Soss->Stheno->DynamicHMC example somewhere, but I can't find it anymore. Is it still possible to define these models via Soss?
- I tried writing a TemporalGPs->DynamicHMC version of the Stheno example, but I couldn't figure out how to write/evaluate the prior density (my best guess at a fix is sketched after the code below).
ERROR: LoadError: MethodError: no method matching logpdf(::GP{AbstractGPs.ZeroMean{Float64},ScaledKernel{TransformedKernel{Matern52Kernel,ScaleTransform{Float64}},Float64}}, ::Array{Float64,1})
using Random
# Load our GP-related packages.
using AbstractGPs
using KernelFunctions
using TemporalGPs
using Distributions
# Load standard packages from the Julia ecosystem
using Optim # Standard optimisation algorithms.
using ParameterHandling # Helper functionality for dealing with model parameters.
using Zygote # Algorithmic Differentiation
using ParameterHandling: value, flatten
using LogDensityProblems, DynamicHMC
using ForwardDiff # needed for ADgradient(:ForwardDiff, ...) below
# Declare model parameters using `ParameterHandling.jl` types.
params = (
    var_kernel = positive(0.6),
    λ = positive(2.5),
    var_noise = positive(0.1),
)
f_naive(params) = GP(params.var_kernel * transform(Matern52Kernel(), params.λ))
function build_gp(params)
    return to_sde(f_naive(params), SArrayStorage(Float64))
end
# Generate some synthetic data from the prior.
n = 100
const x = RegularSpacing(0.0, 0.1, n)
const y = rand(build_gp(value(params))(x, value(params.var_noise)))
# Construct mapping between structured and Vector representation of parameters.
flat_initial_params, unflatten = flatten(params)
function loglik_train(flat_params)
    params = value(unflatten(flat_params))
    f = build_gp(params)
    return logpdf(f(x, params.var_noise), y)
end
function gp_posterior(x, y, p)
    kernel = softplus(p[1]) * (Matern52Kernel() ∘ ScaleTransform(softplus(p[2])))
    f = GP(kernel)
    return posterior(f(x, 0.1), y)
end
# This is the part I can't work out: how should the prior density over the parameters
# be written? The line below is my (broken) attempt.
logprior(params) = logpdf(build_gp(), params)
# Log joint density
function LogDensityProblems.logdensity(ℓ::typeof(loglik_train), params)
    return ℓ(params) + logprior(params)
end
# The parameter space is three-dimensional (var_kernel, λ, var_noise)
LogDensityProblems.dimension(::typeof(loglik_train)) = 3
# `loglik_train` does not itself provide derivatives of the log-likelihood
function LogDensityProblems.capabilities(::Type{<:typeof(loglik_train)})
    return LogDensityProblems.LogDensityOrder{0}()
end
pgp = build_gp(value(params))
# This is the call that throws the MethodError shown above:
logpdf(f_naive(value(params)), flat_initial_params)
n_samples = 2
@time mcmc_result = mcmc_with_warmup(
    Random.GLOBAL_RNG,
    ADgradient(:ForwardDiff, loglik_train),
    n_samples,
)
mcmc_result
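For reference, my best guess is that the prior should be a density over the unconstrained flat parameter vector rather than a logpdf of the GP itself. Below is a minimal sketch of what I mean, assuming independent standard Normal priors on the flattened (unconstrained) parameters, which may not be what I actually want; is this the recommended approach?

# Sketch: prior over the unconstrained flat parameters (assumed standard Normal).
# Reuses loglik_train and flat_initial_params from above.
logprior_flat(flat_params) = -0.5 * sum(abs2, flat_params) - 0.5 * length(flat_params) * log(2π)

logjoint(flat_params) = loglik_train(flat_params) + logprior_flat(flat_params)

logjoint(flat_initial_params) # evaluates without the MethodError above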
- Is it possible to use Soss to combine a TemporalGPs model with a linear regression and sample them as a single object? A rough sketch of the kind of model I have in mind is below.
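Roughly the kind of thing I would like to be able to write (untested pseudocode: I don't know whether Soss accepts a finite GP on the right-hand side of ~, and the priors and names here are just placeholders):

using Soss, Distributions, AbstractGPs, KernelFunctions, TemporalGPs

gp_plus_linreg = @model (x, z) begin
    λ ~ Gamma(2.0, 2.0)   # kernel inverse length-scale
    β ~ Normal(0.0, 1.0)  # regression coefficient
    σ ~ Gamma(2.0, 0.1)   # observation noise std
    f = to_sde(GP(Matern52Kernel() ∘ ScaleTransform(λ)), SArrayStorage(Float64))
    g ~ f(x, 1e-6)        # latent GP values at the inputs x
    y ~ For(1:length(x)) do i
        Normal(g[i] + β * z[i], σ)  # GP plus linear-regression mean
    end
end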
- I tried copying the AdvancedHMC example out of the docs but got an error I don’t understand.
ERROR: LoadError: MethodError: no method matching extract_components(::Stheno.GaussianProcessProbabilisticProgramme{NamedTuple{(:f1, :f2, :f3),Tuple{Stheno.CompositeGP{Tuple{typeof(*),Float64,Stheno.CompositeGP{Tuple{typeof(∘),Stheno.WrappedGP{GP{AbstractGPs.ZeroMean{Float64},SqExponentialKernel}},Stheno.Stretch{Float64}}}}},Stheno.CompositeGP{Tuple{typeof(*),Float64,Stheno.CompositeGP{Tuple{typeof(∘),Stheno.WrappedGP{GP{AbstractGPs.ZeroMean{Float64},SqExponentialKernel}},Stheno.Stretch{Float64}}}}},Stheno.CompositeGP{Tuple{typeof(+),Stheno.CompositeGP{Tuple{typeof(*),Float64,Stheno.CompositeGP{Tuple{typeof(∘),Stheno.WrappedGP{GP{AbstractGPs.ZeroMean{Float64},SqExponentialKernel}},Stheno.Stretch{Float64}}}}},Stheno.CompositeGP{Tuple{typeof(*),Float64,Stheno.CompositeGP{Tuple{typeof(∘),Stheno.WrappedGP{GP{AbstractGPs.ZeroMean{Float64},SqExponentialKernel}},Stheno.Stretch{Float64}}}}}}}}}}, ::RegularSpacing{Float64})
using AbstractGPs, Stheno
using AdvancedHMC, Zygote, ParameterHandling
using Zygote: gradient
using ParameterHandling: value, flatten
using Plots # for the histograms at the end
# Note: x and y here are reused from the TemporalGPs snippet above (x is the RegularSpacing).
unpack(θ_flat::Vector{Float64}) = value(unflatten(θ_flat))
θ = (
    # Short length-scale and small variance.
    l1 = positive(0.4),
    s1 = positive(0.2),
    # Long length-scale and larger variance.
    l2 = positive(5.0),
    s2 = positive(1.0),
    # Observation noise variance -- we'll be learning this as well.
    s_noise = positive(0.1),
)
function build_model(θ::NamedTuple)
    return @gppp let
        f1 = θ.s1 * stretch(GP(SEKernel()), 1 / θ.l1)
        f2 = θ.s2 * stretch(GP(SEKernel()), 1 / θ.l2)
        f3 = f1 + f2
    end
end
θ_flat_init, unflatten = flatten(θ);
# nlml = negative log marginal likelihood (of θ)
function nlml(θ_flat)
    θ = unpack(θ_flat)
    f = build_model(θ)
    # The MethodError above appears to come from this logpdf call, where x is the
    # RegularSpacing from the first snippet.
    return -logpdf(f(x, θ.s_noise), y)
end
# Define the log marginal joint density function and its gradient
ℓπ(θ) = -nlml(θ) - 0.5 * sum(abs2, θ)
function ∂ℓπ∂θ(θ)
    lml, back = Zygote.pullback(ℓπ, θ)
    ∂θ = first(back(1.0))
    return lml, ∂θ
end
# Sampling parameter settings
n_samples, n_adapts = 500, 20
# Draw a random starting point
θ0 = randn(5)
# Define metric space, Hamiltonian, sampling method and adaptor
metric = DiagEuclideanMetric(5)
h = Hamiltonian(metric, ℓπ, ∂ℓπ∂θ)
int = Leapfrog(find_good_eps(h, θ0))
prop = NUTS{MultinomialTS, GeneralisedNoUTurn}(int)
adaptor = StanHMCAdaptor(n_adapts, Preconditioner(metric), NesterovDualAveraging(0.8, int.ϵ))
# Perform inference.
samples, stats = sample(h, prop, θ0, n_samples, adaptor, n_adapts; progress=true)
# Inspect posterior distribution over hyperparameters.
hypers = unpack.(samples);
h_l1 = histogram(getindex.(hypers, :l1); label="l1");
h_l2 = histogram(getindex.(hypers, :l2); label="l2");
h_s1 = histogram(getindex.(hypers, :s1); label="s1");
h_s2 = histogram(getindex.(hypers, :s2); label="s2");
display(plot(h_l1, h_l2, h_s1, h_s2; layout=(2, 2)));
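One guess about the second error: it looks like the GPPP does not accept the RegularSpacing from the first snippet directly, and instead wants a GPPPInput that tags which component process the inputs belong to. Is something like the following (untested) the right fix?

# Hypothetical fix: wrap the inputs in a GPPPInput so the GPPP knows the
# observations belong to the :f3 component; collect(x) turns the RegularSpacing
# into a plain Vector.
x_gppp = GPPPInput(:f3, collect(x))
f = build_model(value(θ))
y_gppp = rand(f(x_gppp, 0.1))
logpdf(f(x_gppp, 0.1), y_gppp)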
I’m interested in any thoughts or suggestions.