Hidden Markov Models in Python: A Simple Hidden Markov Model with a Known Emission Matrix, Fitted with hmmlearn

The Hidden Markov Model

Consider a sensor which tells you whether it is cloudy or clear, but which is wrong with some probability. Now, the weather really *is* either cloudy or clear, and we could go outside and check, so there is a “true” state; we only have noisy observations from which to attempt to infer it.

We might model this process with a Hidden Markov Model (under the assumption that the weather is sufficiently Markovian, i.e. that tomorrow’s weather depends only on today’s), and attempt to make inferences about the true state of the weather over time, the rate at which the weather changes, and how noisy our sensor is.

The Hidden Markov Model describes a hidden Markov Chain which at each step emits an observation with a probability that depends on the current state. In general both the hidden state and the observations may be discrete or continuous.

But for simplicity’s sake, let’s consider the case where both the hidden and observed spaces are discrete. Then the Hidden Markov Model is parameterised by two matrices:

  • the transition matrix, which defines the probabilities of the Markov chain transitioning from one hidden state to the next;
  • the emission matrix, which defines the probabilities of each observed state given the corresponding hidden state, as sketched below.
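To make this concrete, here is a minimal sketch of what the two matrices might look like for the weather example (the numbers are invented purely for illustration):

import numpy as np

# Hidden states: 0 = cloudy, 1 = clear (illustrative values only).
# transition_example[i, j] = P(tomorrow is state j | today is state i)
transition_example = np.array([[0.8, 0.2],
                               [0.3, 0.7]])

# emission_example[i, j] = P(sensor reports j | true weather is i),
# i.e. this sensor is right 90% of the time.
emission_example = np.array([[0.9, 0.1],
                             [0.1, 0.9]])

# Every row is a probability distribution, so each row sums to one.
assert np.allclose(transition_example.sum(axis=1), 1.0)
assert np.allclose(emission_example.sum(axis=1), 1.0)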

Or, formally:

Let Xₙ and Yₙ be discrete-time stochastic processes, with n ≥ 1. Then the pair (Xₙ, Yₙ) is a Hidden Markov Model if:

  • Xₙ is a Markov process whose states are not directly observed, and
  • P(Yₙ ∈ A | X₁ = x₁, …, Xₙ = xₙ) = P(Yₙ ∈ A | Xₙ = xₙ)

for every n ≥ 1 and every measurable set A.

Applications of Hidden Markov Models

The most immediately obvious application for these models is something like the toy example with the weather: noisy measurements of some series in space or time.

In that example, one or neither of the two matrices might be known in advance. In particular, we may have a strong prior belief about how often the sensor is wrong, perhaps from conducting experiments; then we would know the emission matrix, but not the transition matrix.

Hidden Markov Models in Python: hmmlearn

The easiest Python interface to Hidden Markov Models is the hmmlearn module. We can install it in our Python environment with:

conda install -c conda-forge hmmlearn

Or

pip install hmmlearn

Toy data

First of all, let’s generate a simple toy dataset by specifying the generating process for our Hidden Markov Model and sampling from it. Here we will use Pyro to generate our random samples.

import torch
import pyro
import pyro.distributions as dist

# Transition matrix: entry [i, j] is P(next state = j | current state = i).
transition = torch.tensor([[0.9,  0.05, 0.05],
                           [0.05, 0.9,  0.05],
                           [0.05, 0.05, 0.9]])

# Emission matrix: entry [i, j] is P(observation = j | hidden state = i).
emission = torch.tensor([[0.6, 0.3, 0.1],
                         [0.1, 0.7, 0.2],
                         [0.2, 0.3, 0.5]])

samples = []

for j in range(50):
    observations = []

    # Draw the initial hidden state uniformly at random and emit from it.
    state = pyro.sample("x_{}_0".format(j),
                        dist.Categorical(torch.tensor([1.0 / 3] * 3)))
    observation = pyro.sample("y_{}_0".format(j),
                              dist.Categorical(emission[state]))
    observations.append(observation)

    for k in range(1, 50):
        # Step the hidden Markov chain forward...
        state = pyro.sample("x_{}_{}".format(j, k),
                            dist.Categorical(transition[state]))
        # ...and emit a noisy observation from the new state.
        observation = pyro.sample("y_{}_{}".format(j, k),
                                  dist.Categorical(emission[state]))
        observations.append(observation)

    samples.append(torch.stack(observations))

sequences = torch.vstack(samples)
lengths = torch.tensor([len(sequence) for sequence in sequences], dtype=torch.int)
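Before fitting anything, it is worth sanity-checking the shapes of what we just generated (assuming the 50 sequences of 50 observations above):

print(sequences.shape)  # torch.Size([50, 50]): 50 sequences of 50 observations each
print(lengths[:5])      # tensor([50, 50, 50, 50, 50], dtype=torch.int32)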

Inference

Next up is to define our model in terms of hmmlearn! For us, matching discrete hidden states to discrete observations means we need a Multinomial model.

from hmmlearn import hmm
model = hmm.MultinomialHMM(n_components=3, n_iter=10000, params="st", init_params="st")

Now, this is where things get a little more complicated: recall that the emission matrix for our system is known, so we only want to perform inference on the starting states and the transition matrix. In hmmlearn, the params and init_params strings control which parameters are estimated and initialised: ‘s’ for the starting probabilities, ‘t’ for the transition matrix, and ‘e’ for the emission matrix. By leaving out ‘e’, we tell hmmlearn to leave the emission matrix untouched, hence the arguments:

params="st"

And

init_params="st"

Now, all that is left to do is fill in the known emission matrix:

model.emissionprob_ = emission.cpu().numpy().astype(float)

Now to perform inference! To satisfy hmmlearn’s API, we first need to reshape the data a bit: hmmlearn expects all sequences concatenated into a single column. Importantly, since this removes the dimension that tells us which sequence is which, we also need to tell hmmlearn the length of each sequence in the now one-dimensional data.

model.fit(sequences.cpu().numpy().reshape(-1, 1).astype(int), lengths.cpu().numpy().astype(int))
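To see what that reshape is doing, here is a quick sketch (the shapes assume the 50×50 dataset generated above):

flat = sequences.cpu().numpy().reshape(-1, 1).astype(int)
print(flat.shape)                         # (2500, 1): all sequences stacked in one column
print(lengths.sum().item() == len(flat))  # True: the lengths partition that column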

Critique the model

In order to get our inferred transition values, all we need to do is inspect our stateful HMM object!

model.transmat_
array([[0.92999261, 0.03601261, 0.03399478],
       [0.03824865, 0.92332934, 0.03842201],
       [0.04142672, 0.03791173, 0.92066155]])

As we can see, this is an excellent fit! Despite the emission noise confounding our observations, we have recovered the transition matrix almost exactly.
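If we want to quantify that claim rather than eyeball it, one quick check (a sketch, comparing against the transition tensor we defined when generating the toy data) is the largest entry-wise error:

import numpy as np

# Largest absolute difference between the fitted and true transition matrices.
print(np.abs(model.transmat_ - transition.cpu().numpy()).max())  # ~0.03 for the run above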

The full posterior: MCMC with NumPyro

Hmmlearn uses the Baum-Welch algorithm under the hood to fit these parameters, which means what we get is only a maximum likelihood estimate: i.e. the single most likely values of the parameters. Why should this bother us? Well, we have no idea of the variance! Unfortunately, this functionality is not easily available in Python, and to get it we would need to employ the much heavier machinery of Markov Chain Monte Carlo, Stochastic Variational Inference, or some other general-purpose inference algorithm.

An excellent example of a Python library providing such features is NumPyro. With NumPyro, all we need to do is specify our model (the Hidden Markov Model) in terms of NumPyro’s random variables, then put our model into its inference engine.

Of course, while that is only two dozen or so lines of code, most of them are pretty magical, as the backend is doing an awful lot of work to support such a succinct specification! I won’t cover how to handle the above model in NumPyro, as an adequate explanation of using it to find the posterior distribution of the transition probabilities is beyond the scope of this post, but I can strongly recommend that those interested take a look at the following example:

https://num.pyro.ai/en/stable/examples/hmm_enum.html
