Incremental online learning of Turing.jl models

schlichtanders · June 15, 2023, 9:54am

Hi there,

I am looking for ways to learn probabilistic models on time-series data in an incremental, online fashion, processing one data point after another as they arrive.

Hence my question: are there online algorithms for training a Turing.jl model that do this incremental learning efficiently and need only constant memory (i.e. they don’t need to store all the data)?

Any help is highly appreciated. If other packages besides Turing.jl would be a good fit, I am happy to hear about them; it is just that Turing.jl seems to be the go-to package for Bayesian probabilistic modelling.

filchristou · June 16, 2023, 6:33am

Hello! You are probably searching for what people refer to as “Recursive Bayes”. That term will also lead you to this discussion. It’s definitely also worth mentioning RxInfer.jl, from the GitHub organization biaslab, which I think is not mentioned in that discussion.
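A minimal sketch of the recursive-Bayes idea, written in Python for brevity and not tied to any Julia package: for a Normal likelihood with known noise variance and a Normal prior on the mean, the posterior after each observation becomes the prior for the next one, so only two numbers need to be stored. The names `NormalPosterior`, `update`, and `batch_posterior` are made up for this illustration, and the conjugate model is an assumption chosen because its update is available in closed form.

```python
from dataclasses import dataclass


@dataclass
class NormalPosterior:
    """Belief about an unknown mean, summarized by two numbers."""
    mean: float
    var: float


def update(post, y, noise_var):
    """Fold one observation y into the belief (conjugate Normal update)."""
    precision = 1.0 / post.var + 1.0 / noise_var
    new_var = 1.0 / precision
    new_mean = new_var * (post.mean / post.var + y / noise_var)
    return NormalPosterior(new_mean, new_var)


def batch_posterior(prior, ys, noise_var):
    """Reference result: process all observations at once."""
    precision = 1.0 / prior.var + len(ys) / noise_var
    var = 1.0 / precision
    mean = var * (prior.mean / prior.var + sum(ys) / noise_var)
    return NormalPosterior(mean, var)


prior = NormalPosterior(mean=0.0, var=10.0)
stream = [1.2, 0.8, 1.1, 0.9, 1.0]  # illustrative data, not from the thread

post = prior
for y in stream:  # each point is used once and can then be discarded
    post = update(post, y, noise_var=0.5)

batch = batch_posterior(prior, stream, noise_var=0.5)
```

In this conjugate case the streamed result matches the batch result up to floating-point error. For non-conjugate models the posterior has no fixed-size exact summary, which is where the approximate methods mentioned in this thread come in.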

With traditional MCMC you have to re-run the sampler on all your data with every new measurement you receive. I think of all the alternative techniques as more or less efficient approximations of that.
If you are interested in a discount factor, i.e. the parameters of interest evolve over time, you definitely need to move away from traditional MCMC. Sequential Monte Carlo (SMC), for example, targets this (there is a tool called SMC.jl, although I haven’t tried it out).
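One common way to realize a discount factor, sketched here in Python under the assumption of a Normal belief about a single slowly changing level, is to inflate the posterior variance by `1 / delta` before each update so that old observations are gradually forgotten. This is not SMC and not the API of SMC.jl or Turing.jl; `discounted_update` and the parameter `delta` are hypothetical names for this illustration.

```python
def discounted_update(mean, var, y, noise_var, delta):
    """One Normal update with forgetting; delta = 1.0 means no forgetting."""
    prior_var = var / delta  # inflate uncertainty before seeing y
    precision = 1.0 / prior_var + 1.0 / noise_var
    new_var = 1.0 / precision
    new_mean = new_var * (mean / prior_var + y / noise_var)
    return new_mean, new_var


def run(stream, delta, mean=0.0, var=10.0, noise_var=1.0):
    """Process a stream point by point, keeping only (mean, var)."""
    for y in stream:
        mean, var = discounted_update(mean, var, y, noise_var, delta)
    return mean, var


# Illustrative stream whose level jumps from 0 to 5 partway through.
stream = [0.0] * 50 + [5.0] * 20

plain_mean, plain_var = run(stream, delta=1.0)      # remembers everything
tracked_mean, tracked_var = run(stream, delta=0.8)  # forgets old data
```

Without discounting, the estimate is still dominated by the old level after the jump; with `delta=0.8` it ends up close to the new level, at the price of a larger posterior variance.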

With respect to time series, people have long relied on Kalman filters. If you find those too restrictive, Gaussian processes are a very powerful alternative. Both are considered Bayesian and provide nice uncertainty estimates. The modeling assumptions are fixed to (multivariate) normal distributions, but that shouldn’t scare you too much in the Gaussian-process case, because they still manage to be very flexible.
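To illustrate why Kalman filters suit the constant-memory setting, here is a minimal Python sketch of a one-dimensional filter for a local-level model (a random-walk state observed with Gaussian noise). The model choice and the names `kalman_step`, `process_var`, and `noise_var` are assumptions made for this example, not taken from any package mentioned in the thread.

```python
def kalman_step(mean, var, y, process_var, noise_var):
    """One predict-and-update step of a scalar Kalman filter."""
    # Predict: the state follows a random walk, so only uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: blend prediction and observation according to the gain.
    gain = pred_var / (pred_var + noise_var)
    new_mean = pred_mean + gain * (y - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


mean, var = 0.0, 10.0      # initial belief about the level
observations = [2.0] * 100  # illustrative constant signal

for y in observations:  # only (mean, var) is carried between steps
    mean, var = kalman_step(mean, var, y, process_var=0.1, noise_var=1.0)
```

The state carried between steps is just `(mean, var)`, regardless of how many observations have been processed. The multivariate version replaces these scalars with a mean vector and a covariance matrix.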

schlichtanders · June 16, 2023, 6:57am

Thank you very very much for your help.

I read through the older discussion. It seems to me that there is no perfect fit: Gen.jl seems to support SMC approaches, but may be hard to tune and set up. SMC.jl itself does not seem as widespread or user-friendly as the Turing.jl ecosystem.

I am really looking for an approach that I can recommend to others because it is well tested, has a larger user base, and is very stable.

With Turing I can use Variational Inference instead of MCMC. Shouldn’t that solve all the performance problems, as it does a (localized) gradient descent instead of using samples?

schlichtanders · June 16, 2023, 6:59am

RxInfer.jl actually mentions directly on their landing page that they support streaming datasets - one of their key features. That sounds very promising indeed! Thank you for the pointer.

EDIT: This seems to be the example notebook that demonstrates streaming: Infinite Data Stream · RxInfer.jl
