TSAnalysis: time series analysis and state-space modelling

I think you are underestimating the power of abstraction in Julia. You can have a custom type like the one in TimeSeries.jl, which is just data indexed with times (metadata), behave like a generic Julia array. It is important to note that this definition constrains neither users nor developers; it actually helps a lot. I did similar modeling in GeoStats.jl for spatial types, where the indexing is more complicated and requires various domain types as opposed to points on the real line. They are quite convenient to use.

More important than having models is having a nice abstraction over time series, including (1) types (TimeArray is already implemented in TimeSeries.jl and could be renamed to something more general like TimeSeries, although unfortunately the name of the package would then need to change) and (2) a set of verbs for the operations that can be performed with time series, like forecast(ts, newtimes), interpolate(ts, newresolution), simulate(ts, times).

In my opinion, the work of the community here has to concentrate on item (2): defining a set of verbs (a.k.a. an API) for time series that encompasses everything you can possibly do with them. After these verbs are defined, you can start mapping the existing models to a common package, possibly migrating them to a TimeSeriesModels.jl package that would be re-exported by TimeSeries.jl. That is how I would do it.
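
To make the idea of "verbs first, models later" concrete, here is a minimal sketch of what such an API could look like. Everything in it is hypothetical: the abstract type, the RandomWalk toy model and the argument order are assumptions for illustration and do not come from TimeSeries.jl, TSAnalysis.jl or any other package.

```julia
# Hypothetical sketch: none of these names come from an existing package.
abstract type AbstractTimeSeriesModel end

# Generic verbs, declared without methods; each model package would add its own.
function forecast end
function simulate end

# A toy model used only to illustrate dispatch: a random walk without drift.
struct RandomWalk <: AbstractTimeSeriesModel
    sigma::Float64   # standard deviation of the innovations
end

# Point forecast of a random walk: the last observation, repeated `horizon` times.
forecast(model::RandomWalk, y::AbstractVector{<:Real}, horizon::Integer) =
    fill(float(last(y)), horizon)

# Simulate `n` steps starting from `y0` by accumulating Gaussian innovations.
simulate(model::RandomWalk, y0::Real, n::Integer) =
    y0 .+ cumsum(model.sigma .* randn(n))

y = [1.0, 1.5, 1.2]
forecast(RandomWalk(0.1), y, 3)        # three copies of the last observation
simulate(RandomWalk(0.1), y[end], 5)   # one simulated path of length 5
```

A new model would only need to subtype the abstract type and add methods for the same verbs, so user code written against forecast and simulate would not have to change.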

If you want to get some inspiration, check the GeoStats.jl stack, particularly the GeoStatsBase.jl package that contains the abstractions I mentioned.

I think this is where our opinions differ. Having metadata that keeps track of the data ordering or release dates (e.g., a Date vector) is not necessary to perform most operations. There might be cases where this is advantageous or necessary. However, within my domain, they are relatively few and can easily be handled by passing an extra argument to a few functions.

Furthermore, the common ground for most packages is the use of Arrays. In my view, less experienced Julia developers would find it easier to interface multiple packages if the data is all in the same format.

Point 2 is partially implemented in TSAnalysis.jl. In about two weeks, I will open a new dev branch for the VARIMA models. I am re-using the same structure as for the ARIMAs (which will be a special case of the VARIMAs), including the forecast function. simulate will follow.

I implemented some generic methods for estimation of linear state-space models in https://github.com/baggepinnen/ControlSystemIdentification.jl
I’m not making use of subspace methods like n4sid though.

Edit: n4sid is implemented now.

I think that is precisely where you are underusing the type system. You don’t need to know that there is metadata hanging around, and you should be able to treat a time series type exactly as a default Julia array after you implement the interfaces: https://docs.julialang.org/en/v1/manual/interfaces

It is not true that Arrays are the common ground for most packages. Most packages create their own types to leverage multiple dispatch in non-trivial ways. Furthermore, when it is appropriate, these same types implement the interfaces above, and users won’t be able to tell that they are not Julia arrays unless they print the type on the screen.
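
As a rough illustration of this point, the sketch below defines a small time-indexed type that implements the read-only part of the AbstractArray interface from the manual page linked above. SimpleTimeSeries is a hypothetical name invented for this example; it is not the TimeArray type from TimeSeries.jl.

```julia
using Dates

# Hypothetical type for illustration; it is not the TimeArray from TimeSeries.jl.
struct SimpleTimeSeries{T} <: AbstractVector{T}
    times::Vector{Date}   # metadata: one time stamp per observation
    values::Vector{T}     # the data itself
    function SimpleTimeSeries(times::Vector{Date}, values::Vector{T}) where {T}
        length(times) == length(values) ||
            throw(DimensionMismatch("times and values must have the same length"))
        new{T}(times, values)
    end
end

# Minimal read-only AbstractArray interface: size and getindex.
Base.size(ts::SimpleTimeSeries) = size(ts.values)
Base.getindex(ts::SimpleTimeSeries, i::Int) = ts.values[i]
Base.IndexStyle(::Type{<:SimpleTimeSeries}) = IndexLinear()

ts = SimpleTimeSeries(Date(2020, 1, 1) .+ Day.(0:3), [1.0, 2.0, 4.0, 8.0])

sum(ts)       # generic array code works without knowing about the time stamps
diff(ts)      # returns a plain Vector{Float64}
ts.times[2]   # the metadata is still available when needed
```

Functions written for AbstractVector accept ts as it is, while time-aware functions can dispatch on SimpleTimeSeries and use the times field.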

Can you point out which verbs are implemented? What is your vision for what can be possibly done with time series data? If that vision is not clearly externalized and made explicit with an interface, people won’t have clarity about where things should go in terms of research and development.

I will definitely take a look. I think it would be interesting to benchmark ControlSystemIdentification.jl against guilhermebodin’s StateSpaceModels.jl and TSAnalysis.jl as well. I am not suggesting we should merge them, but it would be interesting to understand what the most efficient way to implement these methods is, at least in their simplest form.

I will release more details with the new version - soon :slight_smile:

¹ TimeSeries.jl is a very nice name, but I fear that there might already be a git project with this name. I will check the registry when I come back from the break. I thought about TimeSeriesAnalysis.jl when I started the project, but it sounded a bit too long for my taste. I am open to suggestions though :slight_smile:

I, too, vote for TimeSeriesAnalysis.jl :slightly_smiling_face:

I don’t think that TimeSeriesAnalysis.jl is all that long. It makes very clear what the package is aimed at, it seems to me.

A small update. I released a separate package for penalised vector autoregressions compatible with incomplete data. It is the replication code for the empirical application of a new paper of mine (linked in the Git page). This new package is not registered and probably never will be. Over time, I will merge some of the new features into TSAnalysis.jl.

I will soon start again updating TSAnalysis.jl. I plan to rename it either at the next update or at the following one.

Did you decide on a name for the package?

In electrical engineering (and probably other fields), a distinction is made between signals and systems, where a system essentially is a transformation between two sets of signals.

“Time Series” sounds like the focus is on signals. In systems theory and control engineering, the focus is more on systems, hence “Systems Identification” and similar phrases are more common.

Even when there is only one signal/an input signal, one might think of the signal as generated from some stochastic signal which is unknown. Thus, control engineers tend to even consider a time series (a signal) to be a system.
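
A small sketch of this viewpoint, under assumptions of my own: the "system" is a first-order linear filter with zero initial state, and the observed time series is its output when driven by an unobserved white-noise input. The function name ar1_system is hypothetical and made up for this example.

```julia
using Random

# A "system" in the signals-and-systems sense: it maps an input signal u to an
# output signal y. Here: y[t] = a * y[t-1] + u[t], with y[0] = 0 (assumed).
function ar1_system(u::AbstractVector{<:Real}, a::Real)
    y = zeros(Float64, length(u))
    prev = 0.0
    for (t, ut) in enumerate(u)
        prev = a * prev + ut
        y[t] = prev
    end
    return y
end

rng = MersenneTwister(1)   # fixed seed for reproducibility
u = randn(rng, 200)        # unobserved white-noise input
y = ar1_system(u, 0.8)     # observed "time series": the output of the system
```

Seen this way, fitting an AR(1) model to y amounts to identifying the coefficient a of the system that generated it.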

In any case, the choice of package name probably reflects the field of the developer and the history of that field.

I am undecided between a few names, including TimeSeriesAnalysis. I suppose I will choose on the spot, on the basis of the development plan. Maybe it would be wise to limit the scope of the project a bit to one area of time series, rather than having a package for all possible models.

However, I would really like to have “time series” in the name, as an indication of my field.

I have been following this discussion thread. A time series analysis package for Julia would be an excellent addition to the Julia community!

As I understand it, there are packages in Julia, like TimeModels.jl under JuliaStats, but nothing specific to time series, as you are doing, @fipelle.

Thank you, and I am very much looking forward to this :smiley: .

@fipelle Is there any way you could provide an “auto.arima”-like function, as can be found in R?

I did not like how after running an MCMC on my differenced time series data, I wasn’t getting the results I was expecting. Sometimes, the “standard error” doesn’t apply, in which case, I can always try a Student’s T random variable, with more degrees of freedom, yet…

It’s tough getting the right p, d, q parameters. I have already plotted the correlograms and found some seasonal behavior in the data. Should I try a periodogram also? For reference, I was following this online guide: https://towardsdatascience.com/arima-models-with-turing-jl-81dcf2a1094c.
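
For context, the differencing and correlogram steps mentioned above can be sketched in plain Julia as follows. This is only an illustration under my own assumptions (a toy random walk as data, the biased autocorrelation estimator, and hypothetical function names); it is not an auto.arima replacement and not code from the linked guide.

```julia
using Statistics

# Sample autocorrelation at lags 0:maxlag (biased estimator: divides by n at every lag).
function sample_acf(x::AbstractVector{<:Real}, maxlag::Integer)
    n = length(x)
    xc = x .- mean(x)
    c0 = sum(abs2, xc) / n
    return [sum(xc[1:n-k] .* xc[k+1:n]) / n / c0 for k in 0:maxlag]
end

# Difference the series d times (the "d" in ARIMA(p, d, q)).
function difference(x::AbstractVector{<:Real}, d::Integer)
    y = x
    for _ in 1:d
        y = diff(y)
    end
    return y
end

x = cumsum(randn(300))                   # toy random walk, so d = 1 is appropriate
rho = sample_acf(difference(x, 1), 20)   # correlogram of the differenced series
```

Spikes in rho at multiples of a fixed lag would point to the kind of seasonal behavior described above.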

Sure. There are different ways to do it. For instance, you could use the techniques I described in my latest paper. I think I will implement something similar, based on the implementations in https://github.com/fipelle/ElasticNetVAR.jl. I will also try to add a version of Politis and Romano’s stationary bootstrap.
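
For readers unfamiliar with it, a minimal sketch of the stationary bootstrap is given below. It is a generic textbook-style version written for this post, not the implementation planned for TSAnalysis.jl or the one in ElasticNetVAR.jl; the function name and the parameter p (the probability of starting a new block) are assumptions.

```julia
using Random

# Stationary bootstrap: resample blocks whose lengths are geometric with mean 1/p,
# wrapping around the end of the sample.
function stationary_bootstrap(rng::AbstractRNG, x::Vector, p::Real)
    0 < p <= 1 || throw(ArgumentError("p must lie in (0, 1]"))
    n = length(x)
    out = similar(x)
    idx = rand(rng, 1:n)                   # random starting position
    for t in 1:n
        out[t] = x[idx]
        if rand(rng) < p
            idx = rand(rng, 1:n)           # start a new block at a random position
        else
            idx = idx == n ? 1 : idx + 1   # continue the current block, wrapping around
        end
    end
    return out
end

x = collect(1.0:10.0)
xb = stationary_bootstrap(MersenneTwister(42), x, 0.2)   # expected block length 1/0.2 = 5
```

Smaller values of p give longer blocks on average, which preserves more of the serial dependence in each resample.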

I am afraid it won’t be in the next release. As you can see in the dev branch, I am currently working on implementing the VARIMA. I will do it once this is in production. Would you mind adding an issue on the git page?

Sounds like great progress. I look forward to test driving your VARIMA.
Btw, it would be awesome if at some point it could be extended to VARFIMA.

That’s great!

Before getting into fractional methods I think it would be a good idea to implement state-space approaches to model seasonality and other features in the data (jointly with the VARIMA). I did not see many packages around that allow for these hybrids.

Not a problem. I can do this.

You can start the test drive by adding the dev version of TSAnalysis via ]add TSAnalysis#dev.

I am finishing the debugging. I will register the new version soon enough.

Note: the readme in the dev branch explains how to use the VARIMA functions.

Looks really cool! Right now I only have time to test out the examples in the readme.
Quick question, in

Y = Y_df[:,2:end] |> JArray{Float64};

what does JArray do?

It is just a lazy alias:

const JArray{T, N} = Array{Union{Missing, T}, N};

I am using it to refer to data with (potentially) missing observations.
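
A small usage sketch, with made-up toy data rather than the data from the readme, showing what the alias does when applied to a matrix with gaps:

```julia
using Statistics

# Alias quoted above: an array whose entries may be missing.
const JArray{T, N} = Array{Union{Missing, T}, N}

# Toy data with gaps (hypothetical values, not from the readme).
raw = [1.0 missing; 2.0 4.0; missing 6.0]

Y = raw |> JArray{Float64}   # same as JArray{Float64}(raw); N is inferred as 2

# Column means that skip the missing entries.
col_means = [mean(skipmissing(col)) for col in eachcol(Y)]
```

Because JArray{Float64} is only another name for Array{Union{Missing, Float64}}, Y is an ordinary Julia matrix and works with any function that accepts arrays with missing values.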

I have noticed that people are still following / liking this old post. I just wanted to stress that the package was renamed to MessyTimeSeries as described in https://discourse.julialang.org/t/messytimeseries-jl-and-messytimeseriesoptim-jl.