What the **** is an HMM?
Skip to the good bits below if you’re in a rush
Hidden Markov Models (HMMs for short) are a statistical modeling framework that is ubiquitous in signal processing, bioinformatics and plenty of other fields. They capture the distribution of an observation sequence (Y_t) by assuming the existence of a latent state sequence (X_t) such that:
- (X_t) follows a (discrete time, discrete space) Markov chain
- for each t, the distribution of Y_t is entirely determined by the value of X_t
Imagine we are given an observation sequence (Y_t) and a parametric family of HMMs \{p_\theta\}. Following this famous tutorial, we can list three fundamental problems, each of which has a solution that relies on dynamic programming:
Problem | Description | Solution |
---|---|---|
Evaluation | Compute the likelihood of the observation sequence p_\theta(Y) for a fixed \theta | Forward algorithm |
Decoding | Compute the most likely state sequence \arg\max_X p_\theta(X \mid Y) for a fixed \theta | Viterbi algorithm |
Learning | Find the best parameter \arg\max_\theta p_\theta(Y) | Baum-Welch (EM) algorithm |
Who are the competitors?
Of course, you saw me coming a mile away: there are already some Python packages that provide HMM functionalities. The main two are:
- hmmlearn – this one seems dormant
- pomegranate – this one does plenty of other things, it truly is an amazing toolkit, and its maintainer was very friendly when I asked for help
There is also a Julia package called HMMBase.jl, developed by @maxmouchet a while ago. But for my PhD research, I found myself wanting things that none of these implementations provided. So I rolled up my sleeves and created HiddenMarkovModels.jl.
What follows is not a criticism of past efforts: if anything, it is an hommage. HMMBase.jl in particular was my main source of inspiration, and much of my code comes from it. I have discussed it with @maxmouchet, and we agreed to declare my package the official successor to HMMBase.jl. Since the interface is very similar, users shouldn’t struggle too much with the transition.
As a side note, one can always rephrase an HMM as a probabilistic program and throw some MCMC or variational inference at it. But at least in the basic case I describe, it will be much slower than dedicated routines. That is why I don’t include packages such as Turing.jl in the comparison.
What can the package do?
To get a feel for the package, check out its documentation at https://gdalle.github.io/HiddenMarkovModels.jl/. Here is a feature comparison between HMMBase.jl and HiddenMarkovModels.jl:
Feature | HMMBase.jl | HiddenMarkovModels.jl |
---|---|---|
Number types | Float64 |
anything |
Observation types | Number or Vector |
anything |
Observation distributions | from Distributions.jl | satisfying DensityInterface.jl |
Priors / structures | no | customizable |
Autodiff | no | forward mode (for now) |
Multiple sequences | no | parallelized |
Reaching this level of generality was my main motivation. Indeed, my doctoral research involved differentiating the likelihood of HMMs with multiple sequences of Poisson process observations. This kind of funky combination is now possible with very little additional work from the user.
How reliable is it?
The package follows all the best practices that I’m aware of:
- quality checks with Aqua.jl
- correctness and type stability checks with JET.jl (shoutout to @aviatesk as always)
- coherence checks against HMMBase.jl
- integration checks with third-party packages like SparseArrays, StaticArrays.jl and ForwardDiff.jl
- documentation with Documenter.jl
- benchmarks compatible with PkgBenchmark.jl
Does it run fast?
In addition to the features, there is a number of tricks that I used to speed up the code: avoiding allocations, calling linear algebra subroutines, multithreading across sequences, etc.
The results are pretty amazing on small- to medium-scale problems. HiddenMarkovModels.jl blows HMMBase.jl out of the water, and compares quite favorably to hmmlearn (which has a NumPy backend) and pomegranate (which has a PyTorch backend), even though I ran it on a single thread.
Complete reproducible benchmarks with their explanations can be found at https://gdalle.github.io/HiddenMarkovModels.jl/dev/benchmarks/, they’re run automatically before each docs build.
Where will it go next?
The package is currently awaiting registration (see this PR to the general registry).
Here are some of the things that I would like to work on soon-ish:
- specification and testing for an
AbstractHMM
interface, perhaps with Interfaces.jl - numerical stability in large-dimensional settings with sparse transitions
- reverse mode autodiff with ChainRules.jl
- SIMD optimization with LoopVectorization.jl or Tullio.jl
- spectral estimation methods
- input-output HMMs in my other package ControlledMarkovModels.jl (don’t look at it yet, it’s ugly and unmaintained!)
Contributors are welcome!
In the long run, I will probably transfer this package to JuliaStats, but for now I’d like to keep control until things are stabilized.
I’ve read everything, what now?
If you’ve made it this far, congrats. Take a break. Treat yourself to a fruit juice. Chill.
And whenever you’re ready, you can run the command below, then look at the tutorial to start playing around: https://gdalle.github.io/HiddenMarkovModels.jl/dev/tutorial/.
pkg> add https://github.com/gdalle/HiddenMarkovModels.jl
If you have any suggestions or run into trouble, feel free to open an issue or use this thread! Also, a little star on the repo is always appreciated if you find it useful