Multi-dimensional Markov chains, again
An attempt to formalise some earlier thoughts
In a previous post, I wrote some rambling, very unrigorous and not particularly coherent musings about what would happen if we tried to extend the idea of Markov chains to consider larger chains of events. In this blog post I'll try to work through a more coherent vision of what this might look like, present proofs of some behaviours of the structures I define, and think about some potential applications.
What is a Markov chain, anyway?
A Markov chain is just a set of distinct states which can be moved between, with the property that the probability of moving to any other state depends only on the current state (and not on any of the states visited before it). One possible way of representing a Markov chain is as a directed graph (digraph) with weighted edges, where the edge weights represent the probability of moving from one state to another, like so:
Figure 1 demonstrates a couple of properties of Markov chains:
- the weights of the outbound edges from any node must sum to 1;
- the weights of the inbound edges to any node need not sum to 1;
- nodes can link to themselves (this represents the state not changing);
- a node doesn't need to have a direct link to every other node - a missing edge just means that the probability of moving between those two states is zero.
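To make the digraph picture a bit more concrete, here's a minimal Python sketch of a small chain stored as a dictionary of outbound edge weights. The three weather-ish states and their probabilities are made up purely for illustration - they're not the chain from Figure 1. The sketch checks property 1 from the list above and then takes a short random walk:

```python
import random

# A hypothetical three-state chain (not the one from Figure 1):
# chain[state][next_state] is the probability of moving from state to next_state.
# Missing entries are missing edges, i.e. probability zero.
chain = {
    "sunny":  {"sunny": 0.6, "rainy": 0.1, "cloudy": 0.3},
    "rainy":  {"rainy": 0.5, "cloudy": 0.5},
    "cloudy": {"sunny": 0.4, "rainy": 0.4, "cloudy": 0.2},
}

# Property 1: the outbound edge weights of every node must sum to 1.
for state, edges in chain.items():
    assert abs(sum(edges.values()) - 1.0) < 1e-9, f"outbound weights of {state} don't sum to 1"

def step(state):
    """Pick the next state according to the outbound edge weights of the current state."""
    targets, weights = zip(*chain[state].items())
    return random.choices(targets, weights=weights)[0]

# Take a short random walk through the chain.
state = "sunny"
for _ in range(10):
    state = step(state)
    print(state)
```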
Any digraph can be represented as an adjacency matrix, where the element in the $i$th row and $j$th column is the probability of moving from the $i$th state to the $j$th state - or, phrased differently, the probability of moving to state $j$, given that we've just been in state $i$. That is, given an adjacency matrix $P$ for a Markov chain, $P_{ij} = \Pr(X_{n+1} = j \mid X_n = i)$. Because of property 1 from above, we know that the elements in each row of this adjacency matrix will sum to 1. We call a matrix that satisfies this property right-stochastic or row-stochastic. The right-stochastic matrix that corresponds to Figure 1 would be:
One interesting thing about Markov chains is that we can sometimes find what's known as a stationary distribution, which is essentially the distribution of states you'd expect to see on average if you put a large number of random walkers into the network and let them wander for a long time. The stationary distribution (which is a row vector $\pi$) of a stochastic matrix $P$ satisfies $\pi P = \pi$. This means that $\pi$ is a left eigenvector of $P$ corresponding to the eigenvalue 1!
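As a quick illustration of that last point, here's a sketch of how you might find the stationary distribution numerically, assuming numpy is available. The matrix below is the same made-up chain as in the earlier sketch, not the one from Figure 1; the trick is that a left eigenvector of $P$ is just an ordinary (right) eigenvector of $P^{\mathsf{T}}$:

```python
import numpy as np

# A hypothetical right-stochastic (row-stochastic) transition matrix P,
# not the one from Figure 1: each row sums to 1.
P = np.array([
    [0.6, 0.1, 0.3],
    [0.0, 0.5, 0.5],
    [0.4, 0.4, 0.2],
])

# pi P = pi means pi is a left eigenvector of P with eigenvalue 1,
# i.e. an ordinary eigenvector of P transposed.
eigenvalues, eigenvectors = np.linalg.eig(P.T)
index = np.argmin(np.abs(eigenvalues - 1.0))  # pick the eigenvalue closest to 1
pi = np.real(eigenvectors[:, index])
pi = pi / pi.sum()                            # normalise so the entries sum to 1

print(pi)      # the stationary distribution
print(pi @ P)  # approximately equal to pi, as expected
```

Note that not every chain has a unique stationary distribution (that depends on the structure of the graph), which is why the prose above says we can only sometimes find one.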