Basic question about Hidden Markov Models

I don’t have any experience with Hidden Markov Models but I have a problem that I’m trying to solve that I believe fits within the HMM framework. I’ve been reading about HMMs (this resource is particularly nice) and I’ve been playing with some toy examples via HMMBase.jl.

The question I have is, in the real world, how would I know the transition probabilities for a process I’m unable to observe? I can make educated guesses about the observation likelihoods/emission probabilities but in my case, I don’t really have any idea what the transition probabilities would be.

I’ve thought about using a particular data source as a proxy for the hidden process but, if I can do that in a way that’s satisfactory, it seems to me that I would just want to use that data source/model to solve my problem.

Can anyone provide some advice, point me to some resources that can assist, or discuss a similar problem where the transition probabilities had to be approximated somehow?


Usually you estimate both the hidden transition probabilities and the observation distributions from data, as with any other model. If your methodology allows it (e.g. Bayesian/MAP), you can incorporate the “educated guesses” as informative priors.
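To make "estimate both from data" concrete, here is a minimal sketch of maximum-likelihood fitting with Baum-Welch (EM) for a discrete-emission HMM. It is plain Python rather than HMMBase.jl, so every name in it (`forward_backward`, `baum_welch_step`, `simulate`) is made up for illustration, and the "true" parameters and starting guesses are arbitrary toy values.

```python
import math
import random

def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass. Returns alpha, beta, scale factors, log-likelihood."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    scale = [0.0] * T
    for t in range(T):
        for j in range(n):
            prior = pi[j] if t == 0 else sum(alpha[t - 1][i] * A[i][j] for i in range(n))
            alpha[t][j] = prior * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]  # normalise to avoid underflow
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    return alpha, beta, scale, sum(math.log(s) for s in scale)

def baum_welch_step(obs, pi, A, B):
    """One EM update. Returns new (pi, A, B) and the log-likelihood of the OLD parameters."""
    n, T, m = len(pi), len(obs), len(B[0])
    alpha, beta, scale, loglik = forward_backward(obs, pi, A, B)
    # gamma[t][i]: posterior probability of being in state i at time t
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    new_A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        denom = sum(gamma[t][i] for t in range(T - 1))
        for j in range(n):
            # expected number of i -> j transitions
            num = sum(alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / scale[t + 1]
                      for t in range(T - 1))
            new_A[i][j] = num / denom
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T)) for k in range(m)] for j in range(n)]
    return gamma[0], new_A, new_B, loglik

def simulate(T, pi, A, B, rng):
    """Draw T observations from a discrete HMM (toy data generator)."""
    state = rng.choices(range(len(pi)), weights=pi)[0]
    obs = []
    for _ in range(T):
        obs.append(rng.choices(range(len(B[0])), weights=B[state])[0])
        state = rng.choices(range(len(A)), weights=A[state])[0]
    return obs

rng = random.Random(0)
obs = simulate(500, [0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [[0.8, 0.2], [0.1, 0.9]], rng)

# Deliberately vague starting guesses; EM refines them using only the observations.
pi, A, B = [0.5, 0.5], [[0.6, 0.4], [0.4, 0.6]], [[0.7, 0.3], [0.3, 0.7]]
logliks = []
for _ in range(30):
    pi, A, B, ll = baum_welch_step(obs, pi, A, B)
    logliks.append(ll)
print("estimated transition matrix:", A)
```

The log-likelihood should never decrease from one EM step to the next, which is a handy sanity check. A MAP variant would add pseudo-counts from a Dirichlet prior to the numerators and denominators of the update, which is one way the educated guesses could enter.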

That said, HMMs can suffer from weak identification in many practical applications (beyond the trivial non-identifiability that comes from relabeling the hidden states). The more you can constrain the process using a priori knowledge, the better.
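The relabeling ("index exchange") issue is easy to check numerically. The sketch below, in plain Python with hypothetical helper names (`log_likelihood`, `relabel`) and arbitrary toy numbers, computes the likelihood with the forward algorithm and shows that swapping the two hidden states leaves it unchanged, so the data alone cannot tell the two labelings apart.

```python
import math

def log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete HMM via the scaled forward algorithm."""
    n = len(pi)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    ll = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        c = sum(alpha)
        ll += math.log(c)
        alpha = [a / c for a in alpha]  # normalise to avoid underflow
    return ll

def relabel(pi, A, B, perm):
    """Apply a permutation of the hidden-state labels to all parameters."""
    return ([pi[p] for p in perm],
            [[A[p][q] for q in perm] for p in perm],
            [B[p] for p in perm])

obs = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]          # toy observation sequence
pi = [0.6, 0.4]
A = [[0.9, 0.1], [0.3, 0.7]]
B = [[0.8, 0.2], [0.25, 0.75]]

ll_original = log_likelihood(obs, pi, A, B)
ll_swapped = log_likelihood(obs, *relabel(pi, A, B, [1, 0]))
print(ll_original, ll_swapped)  # the two values agree up to rounding
```

Constraints such as ordering the emission means, or an informative prior that breaks the symmetry, are common ways to remove this particular ambiguity.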


Thanks, Tamas. I was actually hoping you would see this and provide some insight :slightly_smiling_face:

Yeah, sometimes you can pin down the structure or entries of your transition probability matrix from physical reasoning or domain knowledge, up to some unknown parameters, which you can then estimate jointly with the latent trajectory using Bayesian methods. This is about the same as having a prior on the transition probability matrix, but sometimes it’s just more natural to think about transition densities with unknown parameters.
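As a minimal sketch of that idea, suppose domain knowledge says the two hidden states are equally "sticky", so the whole transition matrix depends on one unknown parameter `p`. The example below puts a Beta prior on `p` and finds the posterior mode on a grid. Everything here is an assumption for illustration: the sticky structure, the Beta(2, 2) prior, the known emission matrix, and the helper names. It is also simpler than full joint inference, because the forward algorithm sums over the latent trajectory instead of sampling it.

```python
import math
import random

def log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete HMM via the scaled forward algorithm."""
    n = len(pi)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n)]
    ll = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        c = sum(alpha)
        ll += math.log(c)
        alpha = [a / c for a in alpha]
    return ll

def sticky_matrix(p):
    """Transition matrix fixed by structure up to one parameter: stay with probability p."""
    return [[p, 1.0 - p], [1.0 - p, p]]

def log_posterior(p, obs, pi, B, a=2.0, b=2.0):
    """Unnormalised log-posterior of p under a Beta(a, b) prior (hypothetical choice)."""
    log_prior = (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p)
    return log_prior + log_likelihood(obs, pi, sticky_matrix(p), B)

# Simulate toy data from a sticky chain with true_p = 0.9 and noisy observations.
rng = random.Random(1)
B = [[0.9, 0.1], [0.1, 0.9]]   # emission probabilities, treated as known here
true_p = 0.9
state, obs = 0, []
for _ in range(1000):
    obs.append(rng.choices([0, 1], weights=B[state])[0])
    if rng.random() > true_p:   # switch state with probability 1 - true_p
        state = 1 - state

grid = [i / 100 for i in range(1, 100)]
p_map = max(grid, key=lambda p: log_posterior(p, obs, [0.5, 0.5], B))
print("posterior mode for p:", p_map)
```

With more parameters a grid stops being practical, and that is where a sampler or optimizer from a probabilistic programming package would take over.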


In case anyone else finds this, Turing has a nice tutorial on this:


You can also think of the transition probabilities in terms of the corresponding stochastic process. For example, if you have a Normal transition density centered on the current state, the hidden state is a Brownian motion observed at discrete times (a Gaussian random walk): a process that can wander up and down, with a “flexibility” that depends on the variance of the Normal.
But you could also use an Ornstein–Uhlenbeck process, which tends to gravitate around a mean value.
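To see the difference between the two, here is a small simulation sketch in plain Python. The function names and all parameter values (`sigma`, `theta`, `mu`, `dt`) are arbitrary choices for illustration. The Ornstein–Uhlenbeck step uses the exact Gaussian transition of the process over a time step `dt`, so it is itself a Normal transition density, just one whose mean is pulled toward `mu`.

```python
import math
import random

def rw_step(x, sigma, dt, rng):
    """Brownian-motion step: Normal transition centred on the current state."""
    return x + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)

def ou_step(x, mu, theta, sigma, dt, rng):
    """Exact Ornstein-Uhlenbeck step: mean decays toward mu at rate theta."""
    decay = math.exp(-theta * dt)
    mean = mu + (x - mu) * decay
    sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * theta))
    return mean + sd * rng.gauss(0.0, 1.0)

rng = random.Random(42)
sigma, theta, mu, dt = 1.0, 0.5, 0.0, 1.0   # toy values
rw_path, ou_path = [0.0], [0.0]
for _ in range(2000):
    rw_path.append(rw_step(rw_path[-1], sigma, dt, rng))
    ou_path.append(ou_step(ou_path[-1], mu, theta, sigma, dt, rng))

# The random walk drifts arbitrarily far; the OU path stays in a band around mu.
print("random walk range:", min(rw_path), max(rw_path))
print("OU range:         ", min(ou_path), max(ou_path))
```

Larger `sigma` makes both processes more "flexible", while larger `theta` pulls the OU process back to `mu` more strongly.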


I can relate to this a lot! Part of my work is just about paying attention to continuous-time models like Brownian motion in the background, to make inference for discrete-time processes like Markov chains easier.