Rolling forecast with StateSpaceModels.jl

Hello,

I have a question on using StateSpaceModels.jl to do a rolling window prediction of a time series.

This means:

  1. At time t, using past data y_1...y_t, predict h steps ahead (\hat{y}_{t+1|t} to \hat{y}_{t+h|t}). This includes fitting the time series model (the hyperparameters, in the terminology of the StateSpaceModels.jl documentation).
  2. Then time moves forward so that the actual value y_{t+1} becomes available. Update the prediction filter with this new data point to forecast over the new horizon (\hat{y}_{t+2|t+1} to \hat{y}_{t+h+1|t+1}).

And this second step (which is meant to be iterated) should be done:

  • without refitting the model hyperparameters over the updated timeseries y_1...y_{t+1}
  • without having the filter computation rerun over the entire updated timeseries

i.e. I would like to just move the Kalman filter one step further with the new observation y_{t+1}
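In Kalman-filter terms, the single step I have in mind looks like this. This is a generic textbook sketch for a scalar local-level model, not the StateSpaceModels.jl API; the function name `kalman_step` is made up for illustration:

```julia
# One step of a univariate Kalman filter for a local-level model:
#   y_t = a_t + ε_t,  a_{t+1} = a_t + η_t
# State: (a, P) = predicted state mean and variance;
# H, Q = observation / state noise variances.
function kalman_step(a, P, y_new, H, Q)
    # Update with the new observation y_new
    v = y_new - a          # innovation
    F = P + H              # innovation variance
    K = P / F              # Kalman gain
    a_filt = a + K * v
    P_filt = P - K * P
    # Predict one step ahead (random-walk transition, T = 1)
    return a_filt, P_filt + Q
end

# Rolling use: each new observation advances the filter by one step,
# without rerunning the recursion over y_1...y_t.
function roll(ys)
    a, P = 0.0, 1.0
    for y_new in ys
        a, P = kalman_step(a, P, y_new, 0.3^2, 0.01)
    end
    return a, P
end
```

The point is that `roll` only ever touches the latest observation and the current `(a, P)` pair; the past data is never revisited.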

Also, as a variant of this question: how can a fitted state-space model be used to forecast a different time series, keeping all hyperparameters (probably including the initial state estimates)?

Data and code example

For illustration, here is code that creates a noisy sine wave and makes a forecast at a given instant of the time series (t = kf). It works, but my goal is now to move this forecast instant forward.

using StateSpaceModels
using Random

# noisy data gen
m = 20 # periodicity
nm = 10 # number of periods

t = range(0, m*nm, step=1)
n = length(t)

# noise
σ = 0.3
Random.seed!(0)
e = σ*randn(n)

y = sin.(2*pi*t/m) + e;

Results: (plot of the generated noisy sine wave)

Forecast setup

h = m*2 # Forecast horizon: 2 periods
kf = n-m # Instant at which to forecast: close to the end

# time series fragment up to forecast instant, for fitting:
tfit = t[1:kf]
yfit = y[1:kf]
# time vector of forecast
tf = t[kf] .+ range(1, h, step=1)
# true value, if it exists:
yf_true = [kf+hh<= n ? y[kf+hh] : NaN for hh=1:h]
tf

Forecast with Exponential Smoothing with Seasonality ETS(A,N,A)

mod = ExponentialSmoothing(yfit, seasonal=m)
fit!(mod)
forec = forecast(mod, h); # plot forec.expected_value

which works as expected:

Only minor issue: I get the warning

Warning: The optimization process converged but the Hessian matrix is not positive definite. This means that StateSpaceModels.jl cannot estimate the distribution of the hyperparameters…

But perhaps this relates to my using a deterministic trend, so that the estimation problem is ill-defined? In particular, in the result the smoothing_level and smoothing_seasonal parameters are close to zero (≈0.0001), so I suppose that the initial state estimates (level and seasonal levels) have a strong weight in the forecast. Perhaps these initial seasonal levels correspond to the “seasonal averages” over the time series; I didn’t check. Anyway, my question is not about this!

Hi @pierre-haessig, nice to hear that you are using StateSpaceModels.jl

I understood the request, and the answer probably involves using some internal functions such as update_kalman_state! in src/filters/univariate_kalman_filter.jl of the StateSpaceModels.jl repository, after running kalman_filter.

I can try to build a script with your setup on Monday and will post it here!

About the warning: to report the standard deviations and p-values of the estimated parameters, Optim must be able to calculate the Hessian. Unfortunately this often fails, which is why the package only emits a warning and does not report statistical inference details on the estimate.

1 Like

@pierre-haessig the recommended way would be to rebuild the model with the new observations and deepcopy the hyperparameters. Here is an example:

using StateSpaceModels
using Random

# noisy data gen
m = 20 # periodicity
nm = 10 # number of periods

t = range(0, m*nm, step=1)
n = length(t)

# noise
σ = 0.3
Random.seed!(0)
e = σ*randn(n)

y = sin.(2*pi*t/m) + e;

h = m*2 # Forecast horizon: 2 periods
kf = n-m # Instant at which to forecast: close to the end

# time series fragment up to forecast instant, for fitting:
tfit = t[1:kf]
yfit = y[1:kf]
# time vector of forecast
tf = t[kf] .+ range(1, h, step=1)
# true value, if it exists:
yf_true = [kf+hh<= n ? y[kf+hh] : NaN for hh=1:h]

mod = ExponentialSmoothing(yfit, seasonal=m)
fit!(mod)
forec = forecast(mod, h); # plot forec.expected_value

mod2 = ExponentialSmoothing(y[1:kf + 1], seasonal=m)
mod2.hyperparameters = deepcopy(mod.hyperparameters)
forec2 = forecast(mod2, h); # plot forec.expected_value

Building the system matrices is a bit tricky. For this to work properly, you should always start y at the same observation, or at least in the same “season”. This way, the hyperparameters have the same meaning across the different models.
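As a concrete illustration of the “same season” constraint: the initial seasonal states are anchored to the first observation, so two data windows give hyperparameters with the same meaning only if their first observations fall at the same phase of the seasonal cycle. The helper `same_phase` below is a made-up sketch, not part of StateSpaceModels.jl:

```julia
# Two windows starting at indices start1 and start2 of the same series are
# seasonally aligned iff the offset is a whole number of periods (m = season length).
same_phase(start1, start2, m) = mod(start1 - start2, m) == 0

same_phase(1, 1, 20)  # → true: extending y[1:kf] to y[1:kf+1] keeps the start
same_phase(1, 2, 20)  # → false: dropping the first point shifts the phase
same_phase(1, 21, 20) # → true: starting one full period later is also fine
```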

1 Like

Thanks a lot for the answers.

In the end, the solution with update_kalman_state! is tempting in the sense that it avoids recomputing the filter recursion from the beginning, but I’ll have to dig into it.

Perhaps this will involve creating a new method update_state!(sys, y_new) which internally:

  1. concatenates y_new to sys.y
  2. updates the state

Now a separate question: while looking through the code with this in mind, one aspect struck me as quite strange: why do all system/filter objects hold the data they are filtering (in the sys.y field)? Is this common practice in similar codebases? To me (control-theory background rather than statistics), the filter (with its state) and the data it operates on should be kept separate. In particular, in the two stages I just mentioned, this would remove the need to concatenate y_new to sys.y, which is not needed per se in the state-update computation.
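To make the suggestion concrete, here is a rough sketch of what a data-free filter API could look like. `FilterState`, `System`, and `advance!` are invented names for illustration, not StateSpaceModels.jl code:

```julia
# Filter state held separately from the data it was run on.
mutable struct FilterState
    a::Vector{Float64}   # filtered state mean
    P::Matrix{Float64}   # filtered state covariance
end

# System matrices passed explicitly; no sys.y field involved.
struct System
    Z::Vector{Float64}
    T::Matrix{Float64}
    H::Float64
    RQR::Matrix{Float64}
end

# Advance the filter by one observation; the caller owns the data.
function advance!(st::FilterState, sys::System, y_new::Float64)
    v = y_new - sys.Z' * st.a            # innovation
    F = sys.Z' * st.P * sys.Z + sys.H    # innovation variance
    K = (st.P * sys.Z) / F               # Kalman gain
    a_filt = st.a + K * v
    P_filt = st.P - K * (sys.Z' * st.P)
    st.a = sys.T * a_filt                # one-step prediction
    st.P = sys.T * P_filt * sys.T' + sys.RQR
    return st
end

# Local-level example: scalar state, random-walk transition
sys = System([1.0], ones(1, 1), 0.3^2, fill(0.01, 1, 1))
st  = FilterState([0.0], ones(1, 1))
advance!(st, sys, 0.9)  # caller feeds one new observation at a time
```

With this shape, a rolling forecast is just repeated calls to `advance!` with each new point, and no concatenation to a stored series is needed.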

You may also be interested in the discussion I had with @baggepinnen here: Robust resampling/interpolation method - #17 by langestefan

My application was identification, the rolling forecast method turned out to work extremely well for that purpose (slow dynamics, noisy data, large disturbances, long forecast horizons).

LowLevelParticleFilters.jl can do many things 🙂

1 Like

Thanks again for the suggestion. Here is a rolling implementation.

Generate noisy non-stationary data

(non-stationary, to better see the effect of re-forecasting)

using StateSpaceModels
using Random

n = 20 # number of samples
t = range(0, n-1, step=1)

# noise
trend = 1
σ = 2
Random.seed!(0)
e = σ*randn(n)

y = trend*t + e;

Forecast setup

h = 3 # forecast horizon
kf_range = 8:16
nf = length(kf_range)

Forecast with refit

using simple smoothing (i.e. a constant forecast) to better see the effect of expanding the set of past data

# data structures to hold forecast:
yf_mat = zeros(h, nf) # forecast value
tf_mat = zeros(h, nf) # time of the forecast

for (i, kf) in enumerate(kf_range)
    # forecast data
    yfit = y[1:kf]
    # time vector of forecast
    tf_mat[:,i] = t[kf] .+ range(1, h, step=1)
    # Fit model
    mod_roll = ExponentialSmoothing(yfit) # no trend, no seasonality
    fit!(mod_roll)
    # Forecast
    yf = forecast(mod_roll, h).expected_value; 
    yf = stack(yf) #  Vector{Vector{Float64}} to Array
    yf_mat[:,i] = yf
    
    # Print model params
    α = get_constrained_value(mod_roll, "smoothing_level")
    σ = sqrt(get_constrained_value(mod_roll, "sigma2"))
    l0 = get_constrained_value(mod_roll, "initial_level")
    println("model parameters: α=$(round(α;digits=2)), σ=$(round(σ;digits=1)), l0=$(round(l0;digits=1))")
end

Printed output:

model parameters: α=0.0, σ=2.4, l0=3.7
model parameters: α=0.0, σ=2.8, l0=4.3
model parameters: α=0.3, σ=2.9, l0=2.9
model parameters: α=0.48, σ=2.9, l0=1.7
model parameters: α=0.62, σ=3.0, l0=1.0
model parameters: α=0.73, σ=3.0, l0=0.5
model parameters: α=0.73, σ=2.9, l0=0.5
model parameters: α=0.73, σ=2.8, l0=0.5
model parameters: α=0.73, σ=2.8, l0=0.5

Observation: there is a transition in parameter values:

  • small number of points (<10): α=0.0 (extreme smoothing = constant level), l0 = mean of past observations (I suppose)
  • large enough number of points (≥13): α=0.73 (low smoothing), l0=0.5
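The α=0.0 case can be understood from the plain simple-exponential-smoothing recursion (textbook form, not the StateSpaceModels.jl internals): with α = 0 the level never moves from l0, so the forecast is constant at l0 regardless of the data.

```julia
# Simple exponential smoothing: l_t = α*y_t + (1-α)*l_{t-1}.
# The h-step-ahead forecast is flat at the last level l_t, for every h.
function ses_level(y, α, l0)
    l = l0
    for yt in y
        l = α * yt + (1 - α) * l
    end
    return l
end

y = [3.0, 5.0, 4.0, 6.0]
ses_level(y, 0.0, 4.5)  # → 4.5: level stays at l0, forecast is constant
ses_level(y, 1.0, 4.5)  # → 6.0: no smoothing, forecast = last observation
```

This is consistent with the observation above that, for α=0.0, the fitted l0 carries all the weight in the forecast.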

Forecast with no refit (fixed hyperparams)

First fit a reference model to get parameters.

Implementation trick: due to the transition in parameter values for small time series, we use the complete data set here; in practice, the reference model would be fitted on the smallest time series.

yfit = y[1:kf_range.stop]
mod_roll_ref = ExponentialSmoothing(yfit) # no trend, no seasonality
fit!(mod_roll_ref)

α = get_constrained_value(mod_roll_ref, "smoothing_level")
σ = sqrt(get_constrained_value(mod_roll_ref, "sigma2"))
l0 = get_constrained_value(mod_roll_ref, "initial_level")
println("Ref model parameters: α=$(round(α;digits=2)), σ=$(round(σ;digits=1)), l0=$(round(l0;digits=1))")

params_mod_roll = mod_roll_ref.hyperparameters;

→ output Ref model parameters: α=0.73, σ=2.8, l0=0.5

Rolling forecast

# Fresh matrix to hold the new forecasts
yf_mat_fixhyper = zeros(h, nf)

# time series fragment up to forecast instant, for fitting:
for (i, kf) in enumerate(kf_range)
    # forecast data
    yfit = y[1:kf]
    # time vector of forecast
    tf_mat[:,i] = t[kf] .+ range(1, h, step=1)
    # Create model, but no fit and reuse instead the reference params values
    mod_roll = ExponentialSmoothing(yfit) # no trend, no seasonality
    mod_roll.hyperparameters = deepcopy(params_mod_roll)
    
    # Forecast
    yf = forecast(mod_roll, h).expected_value; 
    yf = stack(yf) #  Vector{Vector{Float64}} to Array
    yf_mat_fixhyper[:,i] = yf
end

Result

(green: forecast with no refit)

Comments:

  • In both cases, the rolling forecast is properly updated when adding most recent data.
  • The forecast values do not coincide in general due to the difference in model parameters.
  • However, for long enough time series, the parameters happen to converge and thus the latest forecasts are the same.

Finally, I also wanted to have an implementation without new model instantiation, e.g. using update_kalman_state!(model.system, ...), but this is more complicated.

I agree, they should be kept separate. At the time, holding everything in a single struct seemed the easier way to implement, but this can be refactored.

It is a little bit ugly, but I got a script that works without re-instantiating the model.

using StateSpaceModels
using Random

# noisy data gen
m = 20 # periodicity
nm = 10 # number of periods

t = range(0, m*nm, step=1)
n = length(t)

# noise
σ = 0.3
Random.seed!(0)
e = σ*randn(n)

y = sin.(2*pi*t/m) + e;

h = m*2 # Forecast horizon: 2 periods
kf = n-m # Instant at which to forecast: close to the end

# time series fragment up to forecast instant, for fitting:
tfit = t[1:kf]
yfit = y[1:kf]
# time vector of forecast
tf = t[kf] .+ range(1, h, step=1)
# true value, if it exists:
yf_true = [kf+hh<= n ? y[kf+hh] : NaN for hh=1:h]

mod = ExponentialSmoothing(yfit, seasonal=m)
fit!(mod)
forec = forecast(mod, h); # plot forec.expected_value

filter = StateSpaceModels.default_filter(mod)
kalman_filter(mod; filter)
filter.kalman_state

RQR = mod.system.R * mod.system.Q * mod.system.R'

# These arguments matter for the filter recursion; this combination forces
# all the Kalman filter equations to be evaluated at each step
t = 0 # note: shadows the time vector `t` defined above
tol = 0.0
skip_llk_instants = 0
filter.kalman_state.steady_state = false

StateSpaceModels.update_kalman_state!(
    filter.kalman_state,
    y[kf],
    mod.system.Z,
    mod.system.T,
    mod.system.H,
    RQR,
    mod.system.d,
    mod.system.c,
    skip_llk_instants,
    tol,
    t,
)

# Forecast will be the new Z * a + d
forecast_y1 = StateSpaceModels.LinearAlgebra.dot(mod.system.Z, filter.kalman_state.a) + mod.system.d


StateSpaceModels.update_kalman_state!(
    filter.kalman_state,
    y[kf+1],
    mod.system.Z,
    mod.system.T,
    mod.system.H,
    RQR,
    mod.system.d,
    mod.system.c,
    skip_llk_instants,
    tol,
    t,
)

forecast_y2 = StateSpaceModels.LinearAlgebra.dot(mod.system.Z, filter.kalman_state.a) + mod.system.d

@show forecast_y1, forecast_y2, forec.expected_value
1 Like

Thanks for this update. I have yet to test it, but it looks promising.