Markov state models (MSMs) are very popular and have a rigorous basis in principle, but applying them in practice must be done with great caution. There is no guarantee the results will be reliable for complex systems of typical interest unless there is an enormous amount of data and significant expertise and validation go into the MSM building. And even if those conditions are in place, certain observables likely will be biased.

# Category: Trajectory Analysis

## A “proof” of the discretized Hill Relation

This is yet another one of those things where, after reading this, you’re supposed to say, “Oh, that’s obvious.” And I admit it is kind of obvious … after you think about it for a few minutes! So spend those few minutes now to learn one more cool thing about non-equilibrium trajectory physics.

In non-equilibrium calculations of transition processes, we often wish to estimate a rate constant, which can be quantified as the inverse of the mean first-passage time (MFPT). That is, one way to define a rate constant is just the reciprocal of the average time it takes for a transition. The Hill relation tells us that the probability flow per unit time into a target state of interest (state “B”, defined by us) is *exactly* the inverse MFPT … so long as we measure that flow in the A-to-B steady state. In that steady state, trajectories are initialized outside state B according to some distribution (state “A”, defined by us), and trajectories reaching state B are removed and re-initialized in A according to our chosen distribution.
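To make the statement concrete, here is a minimal sketch of the discretized version on a toy Markov chain. Everything in it is an assumption made for illustration: the four-state transition matrix `T`, the choice of state 3 as “B”, the `SOURCE` distribution standing in for “A”, and the helper names. It computes the MFPT directly, then builds the recycling steady state and measures the flux into B, so the two numbers can be compared.

```python
# Toy 4-state chain (hypothetical numbers); each row sums to 1.
T = [
    [0.90, 0.10, 0.00, 0.00],
    [0.05, 0.85, 0.10, 0.00],
    [0.00, 0.05, 0.90, 0.05],
    [0.00, 0.00, 0.10, 0.90],
]
TARGET = 3                      # state "B"
SOURCE = [1.0, 0.0, 0.0, 0.0]   # initial distribution defining "A" (zero weight on B)


def first_passage_times(T, target, n_iter=5000):
    """Mean number of steps to first reach `target` from each state.

    Iterates tau_i = 1 + sum_{j != target} T_ij * tau_j until converged.
    """
    n = len(T)
    tau = [0.0] * n
    for _ in range(n_iter):
        tau = [
            0.0 if i == target
            else 1.0 + sum(T[i][j] * tau[j] for j in range(n) if j != target)
            for i in range(n)
        ]
    return tau


def recycling_steady_state(T, target, source, n_iter=5000):
    """Steady state when arrivals in `target` are re-initialized from `source`."""
    n = len(T)
    # Redirect every transition into the target back to the source distribution.
    R = [
        [(0.0 if j == target else T[i][j]) + T[i][target] * source[j]
         for j in range(n)]
        for i in range(n)
    ]
    pi = list(source)
    for _ in range(n_iter):  # power iteration: pi <- pi R
        pi = [sum(pi[i] * R[i][j] for i in range(n)) for j in range(n)]
    return pi


tau = first_passage_times(T, TARGET)
mfpt = sum(p * t for p, t in zip(SOURCE, tau))          # average over the A distribution

pi = recycling_steady_state(T, TARGET, SOURCE)
flux = sum(pi[i] * T[i][TARGET] for i in range(len(T)))  # probability entering B per step

print("MFPT (steps):    ", mfpt)
print("1 / flux (steps):", 1.0 / flux)
```

For this toy chain the two printed values should agree to within the convergence of the iterations. Time here is counted in steps of the chain, so the flux is a probability per step rather than per second.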

## What I have against (most) PMF calculations

Such a beautiful thing, the PMF. The potential of mean force is a ‘free energy landscape’ – the energy-like function whose Boltzmann factor exp[ -PMF(x) / kT ] gives the relative probability* for any coordinate (or coordinate set) x by integrating out (averaging over) all other coordinates. For example, x could be the angle between two domains in a protein or the distance of a ligand from a binding site.

The PMF’s basis in statistical mechanics is clear. When visualized, its basins and barriers cry out “Mechanism!” and kinetics are often inferred from the heights of these features.

Yet aside from the probability part of the preceding paragraph, the rest is largely speculative and subjective … and that’s assuming the PMF is well-sampled, which I highly doubt in most biomolecular cases of interest.
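The probability part, at least, is easy to write down. The sketch below inverts the Boltzmann-factor definition, PMF(x) = -kT ln P(x), using a histogram of sampled values of x. The function name `pmf_from_samples`, the binning choices, and the Gaussian test data are all assumptions for illustration; nothing here addresses whether real samples are adequate, which is the hard part.

```python
import math
import random


def pmf_from_samples(samples, lo, hi, n_bins, kT=1.0):
    """Histogram-based PMF estimate, shifted so its minimum is zero.

    Returns (bin_centers, pmf). Empty bins get math.inf because the
    data carry no information about their probability.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            k = min(int((x - lo) / width), n_bins - 1)  # guard against rounding
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    pmf = [-kT * math.log(c / total) if c > 0 else math.inf for c in counts]
    shift = min(pmf)
    centers = [lo + (k + 0.5) * width for k in range(n_bins)]
    return centers, [f - shift for f in pmf]


# Synthetic "coordinate" data: unit Gaussian, for which the exact PMF is x^2 / 2 in kT units.
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

centers, pmf = pmf_from_samples(samples, -3.0, 3.0, 12)
for x, f in zip(centers, pmf):
    print(f"x = {x:+.2f}   PMF = {f:.2f} kT")
```

Note that the estimate is only as good as the counts in each bin: the rarely visited bins, which tend to be the barriers people care about, are exactly where the statistical error is largest.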

## Everything is Markovian; nothing is Markovian

The Markov model, without question, is one of the most powerful and elegant tools available in many fields of biological modeling and beyond. In my world of molecular simulation, Markov models have provided analyses more insightful than would be possible with direct simulation alone. And I’m a user, too. Markov models, in their chemical-kinetics guise, play a prominent role in illustrating cellular biophysics in my online book, Physical Lens on the Cell.

Yet it’s fair to say that everything is Markovian and nothing is Markovian – and we need to understand this.

If you’re new to the business, a quick word on what “Markovian” means. A Markov process is a stochastic process where the future (i.e., the distribution of future outcomes) depends only on the present state of the system. Good examples would be chemical kinetics models with transition probabilities governed by rate constants or simple Monte Carlo simulation (a.k.a. Markov-chain Monte Carlo). To determine the next state of the system, we don’t care about the past: only the present state matters.
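As a minimal sketch of that definition, consider a two-state kinetic model simulated in discrete time. The per-step transition probabilities `P_AB` and `P_BA` and the helper `step` are hypothetical choices for illustration. The point to notice is that `step` receives only the current state: no history is stored or consulted.

```python
import random

P_AB = 0.1   # per-step probability of A -> B (hypothetical value)
P_BA = 0.3   # per-step probability of B -> A (hypothetical value)


def step(state, rng):
    """Draw the next state using only the present state (the Markov property)."""
    if state == "A":
        return "B" if rng.random() < P_AB else "A"
    return "A" if rng.random() < P_BA else "B"


rng = random.Random(0)
state = "A"
n_steps = 200_000
time_in_A = 0
for _ in range(n_steps):
    time_in_A += (state == "A")
    state = step(state, rng)

frac_A = time_in_A / n_steps
expected_A = P_BA / (P_AB + P_BA)   # stationary population of A for this two-state chain

print("simulated fraction of time in A:", frac_A)
print("stationary population of A:     ", expected_A)
```

With a long enough run, the simulated fraction of time in A should approach the stationary value set by the two transition probabilities alone.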