Category: Statistical Uncertainty

Recovering from bootstrap intoxication

I want to talk again today about the essential topic of analyzing statistical uncertainty – i.e., making error bars – but I want to frame the discussion in terms of a larger theme: our community’s often insufficiently critical adoption of elegant and sophisticated ideas. I discussed this issue a bit previously in the context of PMF calculations. To save you the trouble of reading on, the technical problem to be addressed is statistical uncertainty for high-variance data with small(ish) sample sizes.


Let’s stop being sloppy about uncertainty

Let’s draw a line. Across the calendar, I mean. Let’s all pledge that from today on we’re going to give an honest accounting of the uncertainty in our data. I mean ‘honest’ in the sense that if someone tried to reproduce our data in the future, their confidence interval and ours would overlap.

There are a few conceptual issues to address up front. Let’s set up our discussion in terms of some variable q which we measure in a molecular dynamics (MD) simulation at M successive configurations: q_1, q_2, and so on up to q_M. Regardless of the length of our simulation, we can compute the average of all the values, \overline{q} = \displaystyle\frac{1}{M}\sum_{i=1}^{M} q_i. We can also calculate the standard deviation σ of these values in the usual way, as the square root of the variance. Both of these quantities will approach their “true” values (based on the simulation protocol) with enough sampling – that is, with large enough M.
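To make the two quantities concrete, here is a minimal Python sketch of the average and standard deviation just described. The function name `mean_and_std` is hypothetical, and dividing by M - 1 rather than M in the variance is my assumption about what "the usual way" means. The demo data are synthetic, independent Gaussian draws standing in for q, not output from a real MD run, where successive values are typically correlated.

```python
import math
import random


def mean_and_std(q):
    """Return (q_bar, sigma) for a sequence of measurements q_1 ... q_M.

    q_bar is the arithmetic mean, (1/M) * sum(q_i).
    sigma is the square root of the sample variance; the M - 1
    denominator is an assumption, not something the text specifies.
    """
    M = len(q)
    if M < 2:
        raise ValueError("need at least two values to estimate sigma")
    q_bar = sum(q) / M
    variance = sum((x - q_bar) ** 2 for x in q) / (M - 1)
    return q_bar, math.sqrt(variance)


# Synthetic stand-in for a trajectory of q values (assumed, not MD data):
# independent draws with "true" mean 1.0 and "true" sigma 2.0.
rng = random.Random(0)
trajectory = [rng.gauss(1.0, 2.0) for _ in range(100_000)]

# Both estimates should settle toward the "true" values as M grows.
for M in (10, 1_000, 100_000):
    q_bar, sigma = mean_and_std(trajectory[:M])
    print(f"M = {M:>6}: mean = {q_bar:.3f}, sigma = {sigma:.3f}")
```

Running this for increasing M illustrates the convergence claim: the small-M estimates scatter noticeably, while the large-M estimates sit close to the values used to generate the data.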
