TSAnalysis: time series analysis and state-space modelling

Hi all,

I am a third-year PhD student in Statistics at the London School of Economics and Political Science.

I just released the first version of TSAnalysis (GitHub - fipelle/MessyTimeSeries.jl: A Julia implementation of basic tools for time series analysis compatible with incomplete data.). This package includes basic tools for time series analysis and state-space modelling. I plan to create an environment for forecasting centred on TSAnalysis and based on my doctoral research.

TSAnalysis is written entirely in Julia (for now, it is a rather small package). In addition to simple Arrays, it uses data structures from LinearAlgebra for symmetric and diagonal matrices. This is particularly beneficial for the stability and speed of Kalman routines and estimation algorithms (e.g., the EM algorithm in Shumway and Stoffer, 1982), and for handling high-dimensional forecasting problems.
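As a rough illustration of why these wrappers help, here is a minimal sketch of the covariance prediction step of a Kalman filter using `Symmetric` and `Diagonal` from LinearAlgebra. The matrices and their values are made up for the example, and this is not the package's actual code.

```julia
using LinearAlgebra

# Hypothetical 2-state example (all values are made up for illustration).
C = [0.9 0.1; 0.0 0.5]             # transition matrix
V = Diagonal([0.2, 0.1])           # innovation covariance: only the diagonal is stored
P = Symmetric([1.0 0.3; 0.3 2.0])  # state covariance: only one triangle is read

# Covariance prediction step: P_pred = C * P * C' + V.
# Floating-point round-off can make the raw product slightly asymmetric,
# so the result is wrapped in Symmetric again before being reused.
P_pred = Symmetric(C * P * C' + V)

println(issymmetric(P_pred))
println(isposdef(P_pred))
```

Wrapping the result keeps the covariance exactly symmetric across iterations, which is one common source of instability in naive implementations.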

This is my first Julia package - any feedback is very welcome!


Bibliography

  • R. H. Shumway and D. S. Stoffer. An approach to time series smoothing and forecasting using the EM algorithm. Journal of time series analysis, 3(4):253–264, 1982.

Was it developed in response to inadequate packages in the R ecosystem? What are the reasons for creating a Julia pkg, apart from it being fun and Julia not having one?


Hi,

I developed this package as a lightweight base for some computationally expensive methods that I am using in my PhD thesis (I will release these forecasting methods in the coming months). It was not in direct response to packages (or lack thereof) in R or any other language.

There are several time series packages. However, focusing on state-space modelling, I found that:

  • Alternative filtering and smoothing routines are sometimes unstable, inefficient or not compatible with incomplete data (i.e., data with missing observations).
  • Other approaches are not efficient for computing h-step ahead forecasts in out-of-sample evaluations.
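To make the second point concrete, here is a minimal sketch of h-step ahead point forecasts for a linear state-space model, computed by iterating the transition equation from the last filtered state instead of re-running the filter. The function name `forecast_path` and the notation (B for the loadings, C for the transition matrix) are assumptions for this example and not the package's API.

```julia
# Assumed model (illustrative notation): X[t] = C*X[t-1] + v[t], Y[t] = B*X[t] + e[t].
# Returns the point forecasts of Y for horizons 1, ..., h given the last filtered state.
function forecast_path(B::AbstractMatrix, C::AbstractMatrix, x_last::AbstractVector, h::Int)
    n = size(B, 1)
    forecasts = Matrix{Float64}(undef, n, h)
    x = copy(x_last)
    for j in 1:h
        x = C * x                 # propagate the state one step ahead
        forecasts[:, j] = B * x   # map the state into the observables
    end
    return forecasts
end

# Made-up inputs: one observed series, two states.
B = [1.0 0.0]
C = [0.5 0.0; 0.0 0.2]
x_last = [2.0, 1.0]
fc = forecast_path(B, C, x_last, 3)
println(fc)
```

In an out-of-sample exercise the loop can be restarted from each new filtered state, so the cost per forecast origin stays proportional to h.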

Data structures from LinearAlgebra, the use of @view and structuring the code in simple blocks help, and I needed a single package for state-space modelling. Obviously, it was also for the fun :slight_smile:
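As an illustration of how incomplete data and `@view` fit together, below is a minimal sketch of a single Kalman update that uses only the observed entries of the data vector. The function name `update_observed` and the notation are hypothetical; this is a textbook-style sketch under those assumptions, not the code in the package.

```julia
using LinearAlgebra

# One Kalman update for Y[t] = B*X[t] + e[t], e[t] ~ N(0, R), using only the
# observed entries of y. x_pred and P_pred are the predicted state mean and
# covariance. Names and notation are illustrative assumptions.
function update_observed(y::AbstractVector, B, R, x_pred, P_pred)
    obs = findall(!ismissing, y)
    # Nothing observed at this point in time: keep the prediction
    isempty(obs) && return x_pred, Symmetric(P_pred)

    y_obs = Float64.(y[obs])
    B_obs = @view B[obs, :]      # rows of B for the observed series, without copying
    R_obs = @view R[obs, obs]

    F = B_obs * P_pred * B_obs' + R_obs   # forecast error covariance
    K = (P_pred * B_obs') / F             # Kalman gain
    x_upd = x_pred + K * (y_obs - B_obs * x_pred)
    P_upd = Symmetric(P_pred - K * B_obs * P_pred)
    return x_upd, P_upd
end

# Made-up example: two series loading on one state, second series missing.
B = reshape([1.0, 1.0], 2, 1)
R = [1.0 0.0; 0.0 1.0]
x_pred = [0.0]
P_pred = ones(1, 1)
y = [2.0, missing]

x_upd, P_upd = update_observed(y, B, R, x_pred, P_pred)
println(x_upd)
println(P_upd[1, 1])
```

Selecting the observed rows with views avoids allocating a new loading matrix at every point in time, which matters when the cross-section is large.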


Hi, thanks for sharing.

You mention state-space frequently, but from reading the documentation I don’t think you do delay coordinate embeddings, but rather some other state-space reconstruction (observation/transition equations).

Are your methods related in any way to the ones described in Nonlinear Timeseries Analysis by Kantz?

p.s.: an example showing a prediction made with your package (like this one: Local Modeling & Timeseries Prediction - TimeseriesPrediction ) would make the readme better I feel.

That’s great to see, as the forecast package is pretty much the only thing these days I go back to R for. Are you planning on doing something similar, maybe a TSForecast package that has auto ARIMA, ETS etc?


Hi,

Thank you for your feedback!

@Datseris You are correct, I am not. For now, I am supporting linear state-space models à la Harvey (1990), Durbin and Koopman (2012), and co-authors. There is a small example with kforecast in the readme, but I reckon that it is not great. I will expand on that :slight_smile:
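For readers less familiar with that literature, a linear Gaussian state-space model is usually written as an observation equation and a transition equation, rather than as a delay embedding of the observed series. The symbols below are an assumption for illustration and not necessarily the notation used in the package:

```latex
\begin{aligned}
Y_t &= B X_t + e_t,     & e_t &\sim N(0, R), \\
X_t &= C X_{t-1} + v_t, & v_t &\sim N(0, V).
\end{aligned}
```

Here Y_t collects the observed series, X_t the latent states, B the loadings, C the transition matrix, and R and V the covariance matrices of the two error terms.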

@nilshg Yes. In the coming releases I will try to extend support to the following methods:

  • ARIMA models
  • Standard (aka textbook) univariate decompositions (e.g., seasonal adjustments)
  • ACFs, CCFs and other basic functions for time series analysis. I know there is plenty of support already, but I feel that they should be included in a TS package.
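As an example of the kind of basic function meant in the last item, here is a minimal sketch of a sample autocorrelation function in plain Julia. The name `sample_acf` is hypothetical and this is not an interface that the package exposes.

```julia
# Sample autocorrelation of a series y at lags 0, ..., max_lag,
# using the usual estimator that divides every autocovariance by T.
function sample_acf(y::AbstractVector{<:Real}, max_lag::Int)
    T = length(y)
    mu = sum(y) / T
    d = y .- mu
    gamma0 = sum(abs2, d) / T
    return [sum(d[1+k:T] .* d[1:T-k]) / (T * gamma0) for k in 0:max_lag]
end

y = [1.0, 2.0, 3.0, 4.0, 5.0]
acf = sample_acf(y, 2)
println(acf)   # approximately [1.0, 0.4, -0.1]
```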

Afterwards, it would be nice to cover TVP versions of the models above. I am not planning to release estimation algorithms anytime soon - I think I will rely on Optim for now.


Bibliography

  • Harvey, A. C. (1990). Forecasting, structural time series models and the Kalman filter . Cambridge university press.
  • Durbin, J., & Koopman, S. J. (2012). Time series analysis by state space methods . Oxford university press.

FYI the forecast package in R is really nice. Probably one of the best time-series packages around (and written by a fellow Aussie - Rob Hyndman down in Melbourne).


@fipelle for ARIMA there is https://github.com/Datseris/ARFIMA.jl . It works very well and it is very fast. I haven’t written tests for it because I never cared to release it as a proper Julia package, but if you want to add it as a dependency you can just contribute basic tests via a PR and we can release it.


Thank you! Although it is an interesting package, I am already halfway through some of the tasks described above, including the implementation of the ARIMA.

As soon as I have all the building blocks for TSAnalysis, I will post a precise list of features I would like to add. This should help us organise and collaborate better :slight_smile:

@nilshg: a first support for the ARIMA model is on dev.

I have documented it only via docstrings and I am currently debugging it. I will probably make some other changes, but it should be already functional enough to try it.

In order to estimate an ARIMA(d,p,q) you need to

  1. Define an ARIMASettings.

    • For example arima_settings = ARIMASettings(Y, 0, 1, 1).
    • Y is a row vector.
  2. Run arima(arima_settings).

The forecast function computes the forecast for the model. I have not implemented a version of forecast to compute the prediction in the original scale of Y yet. I will register a new version once this is done (and the above fully debugged).
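To illustrate what "original scale" means here, below is a minimal sketch in plain Julia of first-differencing a series stored as a 1×T matrix and mapping forecasts of the differenced series back to levels. The data and the forecast values are made up, and nothing in this block calls the package.

```julia
# y is a hypothetical series; Y holds the same data as a 1×T matrix ("row vector").
y = [10.0, 12.0, 15.0, 19.0]
Y = permutedims(y)                 # 1×4 Matrix{Float64}

# First difference (d = 1): the model is then fitted on dY rather than on Y.
dY = diff(Y, dims=2)               # 1×3 matrix: [2.0 3.0 4.0]

# Suppose a model returned these forecasts for the differenced series
# (made-up numbers, not the output of any package).
dY_forecast = [5.0 6.0]

# Forecasts in the original scale: last observed level plus the cumulated changes.
Y_forecast = Y[:, end] .+ cumsum(dY_forecast, dims=2)   # [24.0 30.0]
println(Y_forecast)
```

For higher orders of differencing the same cumulation would have to be applied once per difference.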

Ideally, simple (univariate and linear) state-space models will have a similar syntax.


Hi all,

I have released a new version (v0.1.2) and it is now being registered. I do not think that this release is major enough to require a new post. However, there are a few important changes:

  • I have added support for ARMA and ARIMA models;

  • I have added tests for the Kalman filter, smoother, ARMA and ARIMA models - this might be less appreciable from a user perspective, but it makes the package more robust;

  • I solved minor bugs and deprecated a few methods that were not being used by any function;

  • I updated the README.md and included two small examples in the /examples/ folder.


Considering the current status of the project I think that it would be easier to proceed as follows:

  • Build on the current functions for ARIMA / ARMA and add support for VARIMA / VARMA models (this should be in v0.1.3);

  • Add support for textbook univariate state-space models (v0.1.4);

  • Add a range of time series functions (e.g., ACF, CCF, …) and time varying versions of the most used models (v0.1.5 and v0.1.6);

  • Less certain: extend TVP to all models and release v0.2.0.

Note: in the examples, I substituted SimulatedAnnealing() with NelderMead(). Empirically, NelderMead() seems worse on my data. However, it is less computationally expensive and this might help when estimating ARIMAs and more complicated models on standard laptops.
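For anyone who wants to compare the two optimisers on their own machine, here is a minimal sketch using Optim on a toy problem. The data-generating function, the objective `css` and all the numbers are made up for illustration; this is not the objective used in the package examples, and it assumes that Optim is installed.

```julia
using Optim   # third-party package, assumed to be installed

# Deterministic toy data from an AR(1) with intercept (made-up values).
function simulate_ar1(c, phi, T)
    y = zeros(T)
    for t in 2:T
        y[t] = c + phi * y[t-1] + 0.1 * sin(t)   # sin(t) stands in for noise
    end
    return y
end

y = simulate_ar1(0.5, 0.7, 200)

# Conditional sum of squares for the parameters θ = [c, phi].
css(θ) = sum(abs2, y[2:end] .- θ[1] .- θ[2] .* y[1:end-1])

x0 = [0.0, 0.0]
res_nm = optimize(css, x0, NelderMead())
res_sa = optimize(css, x0, SimulatedAnnealing(), Optim.Options(iterations = 10_000))

# Closed-form least squares solution, used only as a reference point.
X = [ones(length(y) - 1) y[1:end-1]]
θ_ols = X \ y[2:end]

println(Optim.minimizer(res_nm))
println(Optim.minimum(res_nm), " ", Optim.minimum(res_sa), " ", css(θ_ols))
```

On a smooth problem like this one the least squares solution gives a reference value for the objective; on harder likelihood surfaces the relative ranking of the two optimisers may well differ.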


Your package looks promising.
Here is a very popular set of benchmark models for TS forecasting (A forecast ensemble benchmark | Rob J Hyndman).
It would be awesome to see this implemented in Julia & I expect it to run much much faster.
It would also give your package a lot of exposure.
Good luck!


Hey @fipelle! Nice to know that there are more state-space model packages in Julia. We have some basic implementations for linear state-space models (GitHub - LAMPSPUC/StateSpaceModels.jl: StateSpaceModels.jl is a Julia package for time-series analysis using state-space models). Maybe we could exchange experiences.


Hi,

@guilhermebodin: Absolutely! I am on holiday right now and I paused the updates until mid-January. I think we should probably start by benchmarking our packages on a common set of tests. At the moment, I have a limited number of models implemented. I would have to wait until v0.1.5 to have a more complete set. However, I could start after v0.1.3 (see above for more details).

I see three options to develop meaningful (but simple) benchmarks:

  1. Use the tests suggested by @Albert_Zevelev. This could be a good way to start since it uses M3 competition data (and it is well-known by a machine-learning audience).
  2. Replicate some textbook examples on linear state-space models.
  3. While options 1-2 are interesting, I feel that at the type of conferences where I usually present, people look at other datasets, such as FRED-MD and FRED-QD. Furthermore, benchmarks based on univariate models might not be that informative unless the underlying data has been recorded for a very long time (i.e., with many observations). Thus, some researchers could be more interested in benchmarks based on DFMs and other multivariate models. We could agree on a number of specifications to use.

What do you think about the above?


If I may, I’d like to suggest you rename the package TimeSeries.jl or TimeSeriesAnalysis.jl. When fitting something into the top level of the whole Julia ecosystem, I think it’s a good idea to be very explicit about the meaning and scope of a package. TS probably has a very clear meaning to people working in time series, but to the broader Julia community, I suspect it doesn’t.


Thank you for your comment.

I think it is a fair point. I am open to renaming the package to something more explicit. However, it might be wise to wait until the project takes a more defined shape to pick the right name.¹

Do you know if there is an official procedure to rename a package?


¹ TimeSeries.jl is a very nice name, but I fear that there might already be a git project with this name. I will check the registry when I come back from the break. I thought about TimeSeriesAnalysis.jl when I started the project, but it sounded a bit too long for my taste. I am open to suggestions though :slight_smile:


+1 for TimeSeriesAnalysis.jl

Long names are no problem with tab completion, and make code more readable, IMO.


@fipelle I am down. It would be nice to replicate some results for the predefined time series of STAMP as well.


Looks interesting. How does your linear, state space model relate to the N4SID algorithm in MATLAB’s System Identification Toolbox?

As a PhD student, I took a course in mathematical systems theory, which involved B. L. Ho’s algorithm from the 1960s (?). Masanao Aoki’s 1987 book, State Space Modeling of Time Series (State Space Modeling of Time Series | SpringerLink), was another step in this direction, introducing SVD and subspace ideas (others may have used the same ideas before; I don’t know the historic developments). The N4SID does similar things as Aoki did, with the inclusion of deterministic inputs. There are many other approaches.

So I’m curious as to whether your approach is related to these methods.

+1 for changing the name to something more explicit. I like the general name TimeSeries.jl, especially if you can create an organization for time series in Julia and have this package as the umbrella package with all the batteries included. I don’t like the name TimeSeriesAnalysis.jl because analysis is just one part of the game. There are probably many synthesis algorithms for time series to consider.
