{"id":213,"date":"2018-06-13T21:19:58","date_gmt":"2018-06-13T21:19:58","guid":{"rendered":"http:\/\/statisticalbiophysicsblog.org\/?p=213"},"modified":"2018-12-09T05:08:20","modified_gmt":"2018-12-09T05:08:20","slug":"recovering-from-bootstrap-intoxication","status":"publish","type":"post","link":"https:\/\/statisticalbiophysicsblog.org\/?p=213","title":{"rendered":"Recovering from bootstrap intoxication"},"content":{"rendered":"<p>\n  I want to <a href=\"https:\/\/statisticalbiophysicsblog.org\/?p=180\">talk again<\/a> today about the essential topic of analyzing statistical uncertainty &#8211; i.e., making error bars &#8211; but I want to frame the discussion in terms of a larger theme: our community\u2019s often insufficiently critical adoption of elegant and sophisticated ideas.  I discussed this issue a bit previously in the context of <a href=\"https:\/\/statisticalbiophysicsblog.org\/?p=160\">PMF calculations<\/a>.  To save you the trouble of reading on, the technical problem to be addressed is statistical uncertainty for high-variance data with small(ish) sample sizes.\n<\/p>\n<p>\n  <!--more-->\n<\/p>\n<p>\n  Elegance and effectiveness are not the same thing.  I should know something about it because I used to try to make a living in part by pursuing algorithmic elegance (<a href=\"https:\/\/journals.aps.org\/prl\/abstract\/10.1103\/PhysRevLett.96.028105\">\u2018resolution exchange\u2019 simulation<\/a> and <a href=\"https:\/\/aip.scitation.org\/doi\/abs\/10.1063\/1.1760511\">path sampling for Jarzynski-based free energy calculations<\/a>, to name two of my group\u2019s less-than-world-changing developments).  
In other words, I\u2019m both a perpetrator and victim of the problem.\n<\/p>\n<p>\n  Let\u2019s dive into the bootstrap statistical analysis approach, which is truly beautiful and mathematically well-founded, but which can fail in at least one very important way &#8211; <em>if we aren\u2019t careful about using it in a manner consistent with its underlying assumptions<\/em>.  That\u2019s an academic way of saying that I was trying to use bootstrapping \u2018off the shelf\u2019 without understanding it well.  And in some sense, the problem I will discuss results from insufficient data even though the bootstrap is sometimes noted as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Bootstrapping_(statistics)\">useful for small samples \u2026 but also as potentially unreliable in that regime<\/a>; see the Schenker and Chernick &amp; Labudde articles listed below.\n<\/p>\n<p>\n  First, what is the basic bootstrapping idea?  It is a perfectly valid strategy for estimating statistical variation of arbitrary observables based on a given sample of data.  The sample at hand (e.g., a set of configurations, or a set of energy values or rates) is assumed to be representative and hence is used as a proxy for the true distribution.  From this proxy distribution, which is discrete even if the underlying space is continuous, numerous \u201cbootstrap\u201d <a href=\"https:\/\/en.wikipedia.org\/wiki\/Resampling_(statistics)\">resamples<\/a> can be drawn, which are just samples with replacement.  These bootstrap (re)samples represent a simulation of all possible samples <em>assuming<\/em> the initial sample is representative &#8211; and hence represent all variation consistent with that assumption.\n<\/p>\n<p>\n  With a set of bootstrap samples in hand, it is straightforward to make a confidence interval for an <em>arbitrary <\/em>observable. For each bootstrap (re)sample, we can calculate that observable and make a histogram.  
For example, if each sample consists of a single scalar <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-ede05c264bba0eda080918aaa09c4658_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;\" title=\"Rendered by QuickLaTeX.com\" height=\"8\" width=\"10\" style=\"vertical-align: 0px;\"\/>, we could compute the \u201cobservable\u201d <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-a5e0e31e823b4d5c9a90c0d01d5e8fcb_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#120;&#94;&#51;\" title=\"Rendered by QuickLaTeX.com\" height=\"15\" width=\"17\" style=\"vertical-align: 0px;\"\/> for every sample and build a histogram of these.  Likewise, from a set of configurations, we could build a histogram of energy values.  Then, a 95% confidence interval would consist of the 2.5 and 97.5 %ile values from the histogram.\n<\/p>\n<p>\n  For later, remember this <em>bootstrap confidence interval represents the variation expected if the original sample was a good proxy for the true distribution.<\/em>  This is a \u201cfrequentist\u201d idea that we\u2019ll revisit below.  For now it sounds reasonable enough.  And of course, bootstrapping is a very elegant procedure!\n<\/p>\n<p>\n  Let\u2019s move on to the downside.  We\u2019ll consider a concrete, but very simple case where a failure can occur.  
Before you read on, or tell yourself a \u201ctoy\u201d example is not important, <em>I want to emphasize that our simple binomial example is highly relevant to certain types of real data, as I will describe later on in the post.<\/em>\n<\/p>\n<p>\n  Assume we want to estimate the probability <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> of the value 1 in a binomial distribution (where zero occurs with probability <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-25ae9b548488a07106b1ed3181057538_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#49;&#45;&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"39\" style=\"vertical-align: -4px;\"\/>). That is, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> itself is the observable of interest.  Say the sample we have in hand to estimate <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> is (1,0,0,0,0).  
Of course we would estimate <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-72f17fc6f6eaf5790341a0b1a961c050_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#49;&#47;&#53;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"60\" style=\"vertical-align: -5px;\"\/>.  But how confident are we based on just five draws from the binomial distribution?  Perhaps <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> is actually much smaller and we were lucky to see a single 1.  Or perhaps <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> was larger and we got unlucky.  How do we think about this?\n<\/p>\n<p>\n  The bootstrapping procedure readily provides a confidence interval, as described above \u2026 but we will see that it is seriously flawed.  Bootstrapping calls for sampling with replacement repeatedly from the original (1,0,0,0,0) sample.  Sampling with replacement means that we draw sets of five elements from the original set and every element is chosen with equal probability. Thus, any element may randomly occur more than once (which is probably easier to think about if we had five different values, but we\u2019ll stick to binomial).  In our binomial case, a couple of the (re)samples might look like (0,1,0,0,1) or (0,0,0,0,0).  
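<\/p>
<p>
  To make this concrete, here is a minimal sketch of the resampling procedure in Python.  I am assuming numpy here; the seed, the number of resamples, and the variable names are arbitrary illustrative choices, not part of any standard recipe.
<\/p>

```python
import numpy as np

rng = np.random.default_rng(42)
sample = np.array([1, 0, 0, 0, 0])   # the five binomial draws in hand

# draw many bootstrap resamples: each row is five values drawn
# with replacement from the original sample
n_boot = 100_000
resamples = rng.choice(sample, size=(n_boot, sample.size))
estimates = resamples.mean(axis=1)   # estimate of p from each resample

# percentile confidence interval: 2.5th and 97.5th percentiles
lo, hi = np.percentile(estimates, [2.5, 97.5])
print(f'95% bootstrap CI for p: [{lo:.2f}, {hi:.2f}]')

# fraction of resamples containing no 1s at all; analytically (4/5)**5 = 0.328
print(f'fraction of resamples with p = 0: {(estimates == 0).mean():.3f}')
```

<p>
  Note that roughly a third of the resamples contain no 1s at all, a point we return to shortly.
<\/p>
<p>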
Imagine drawing thousands of such samples and estimating <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> from each, simply based on the fraction of ones \u2013 i.e., 2\/5 or 0 for the preceding examples.  Then we can construct a histogram of these  values and use the 2.5 and 97.5%iles to make a confidence interval.  This interval represents the range of the 95% most likely values <em>if<\/em> our original sample was representative of the true distribution.\n<\/p>\n<p>\n  For the binomial case, there are few enough possibilities that we can easily predict what will happen in a bootstrap process.  In particular, we know that in generating each bootstrap sample we have a 1\/5 chance of a 1 and 4\/5 chance of 0.  Let\u2019s focus on a key occurrence, the outcome (0,0,0,0,0), which leads to <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-ef3696bb3425b816a345f8115a18fef1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"43\" style=\"vertical-align: -4px;\"\/> and has a probability of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-b405f13638a82f040c656413c76e7030_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#40;&#52;&#47;&#53;&#41;&#94;&#53;&#32;&#92;&#115;&#105;&#109;&#101;&#113;&#32;&#48;&#46;&#51;&#51;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"103\" style=\"vertical-align: -5px;\"\/>.  
This tells us right away that the lower limit of the confidence interval will be 0 (because 2.5% is less than 33% and hence occurs at <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-ef3696bb3425b816a345f8115a18fef1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"43\" style=\"vertical-align: -4px;\"\/> in the histogram).  In fact, since <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-ef3696bb3425b816a345f8115a18fef1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"43\" style=\"vertical-align: -4px;\"\/> occupies 33% of the probability, even when we chop off the lowest 2.5%, bootstrapping\u2019s confidence interval still covers <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-ef3696bb3425b816a345f8115a18fef1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"43\" style=\"vertical-align: -4px;\"\/> for roughly 30% of its 95%!
<em>Although we don\u2019t have much data, we do have enough data to rule out certain models:<\/em> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> must be strictly greater than zero.  We can use Bayesian ideas to quantify our common sense, but let\u2019s hold off on that for a moment.\n<\/p>\n<p>\n  Let me now explain why a binomial example has been relevant for my own research on reasonably complex biomolecular systems.  We have been simulating the folding of proteins using the \u201cweighted ensemble\u201d (WE) path sampling method, which can yield unbiased rate estimates.  Each (quite expensive) WE run yields an estimate for the folding rate, but there is a very large variance among these estimates: one run may yield a rate that is several orders of magnitude larger than another.  And a set of 10-30 WE folding runs yields a set of rates whose average is dominated by a small number (&lt; 5) of the runs.  The small number of large values is essentially like the 1\u2019s in a binomial distribution, with the remainder like the 0\u2019s.  In that sense it\u2019s the same situation, and bootstrapping will suggest an artifactually small lower limit for the confidence interval of the folding rate.  So for me, the binomial example is very relevant.  (Note that a \u201cregular\u201d confidence interval based on the standard error of the mean will yield unphysical negative values because the data is highly non-Gaussian.  Note also that one difference between the folding data and binomial samples is that in the binomial case we know that 1 is the largest possible value whereas we don\u2019t know the largest possible value for the folding rate ahead of time.  
However, the critique of the lower confidence limit is still valid.)\n<\/p>\n<p>\n  We\u2019ve seen what goes wrong procedurally with the bootstrap for a high-variance small sample, but what goes wrong theoretically?  Here we start to get into issues of frequentism and Bayesianism, but I want to avoid jargon and abstraction.  Let\u2019s just remind ourselves what the bootstrap procedure does.  It generates a rather complete distribution of outcomes <em>assuming that the sample in hand is a good approximation to the true underlying distribution.<\/em>  But in generating a confidence interval for, say, the <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> parameter of a binomial distribution, I would argue it\u2019s clear that it\u2019s not really the possible outcomes we care about.  Rather, we want to estimate the likelihood that a given <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/> could have generated the sample we have, which FYI is the Bayesian point of view: find the models most consistent with data.  
Also, as with many statistical approaches, bootstrapping seems to be theoretically founded on large-numbers statistics, which clearly are lacking in our example.\n<\/p>\n<p>\n  More concretely, given the binomial sample (1,0,0,0,0), it\u2019s clear this is consistent with <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-72f17fc6f6eaf5790341a0b1a961c050_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#49;&#47;&#53;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"60\" style=\"vertical-align: -5px;\"\/>, but what about other values of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/>?  This is easy to quantify in the binomial case because from <a href=\"https:\/\/en.wikipedia.org\/wiki\/Binomial_distribution\">elementary binomial statistics<\/a>, the probability of a sample with a single 1 is <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-284b5ea034d3b0dbc7b709fd80c79020_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#114;&#109;&#123;&#115;&#97;&#109;&#112;&#108;&#101;&#125;&#125;&#32;&#61;&#32;&#53;&#112;&#40;&#49;&#45;&#112;&#41;&#94;&#52;\" title=\"Rendered by QuickLaTeX.com\" height=\"21\" width=\"154\" style=\"vertical-align: -6px;\"\/>.  
From a Bayesian perspective, we can use this expression as a basis for estimating the probability that our sample came from <em>any<\/em> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/>.  That is, even though strictly speaking the expression for <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-df9b6c7e6adaf716cfc245c4e2e8d844_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#114;&#109;&#123;&#115;&#97;&#109;&#112;&#108;&#101;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"51\" style=\"vertical-align: -6px;\"\/> gives the fraction of five-element samples expected to have a single 1 for a <em>fixed value<\/em> of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/>, we can reverse the direction of the logic and use <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-df9b6c7e6adaf716cfc245c4e2e8d844_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#114;&#109;&#123;&#115;&#97;&#109;&#112;&#108;&#101;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"14\" width=\"51\" style=\"vertical-align: -6px;\"\/> as our best guess for the probability (density) that our particular sample came from any given 
value of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/>.  So aside from the issue of normalization, we can think of it as a function of <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf85f1087e9fbed3a319341134ac1a2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"10\" style=\"vertical-align: -4px;\"\/>: <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-3bf39af01fceba2d57e02ef5f866686c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#95;&#123;&#92;&#109;&#97;&#116;&#104;&#114;&#109;&#123;&#115;&#97;&#109;&#112;&#108;&#101;&#125;&#125;&#40;&#112;&#41;\" title=\"Rendered by QuickLaTeX.com\" height=\"20\" width=\"74\" style=\"vertical-align: -6px;\"\/>.  
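<\/p>
<p>
  As a quick numerical check, we can evaluate this unnormalized function on a grid.  This is just a sketch of the arithmetic (assuming numpy), not any standard routine:
<\/p>

```python
import numpy as np

# unnormalized likelihood of the sample (1,0,0,0,0) as a function of p
p = np.linspace(0.0, 1.0, 1001)
p_sample = 5 * p * (1 - p)**4

print(f'peak at p = {p[np.argmax(p_sample)]:.2f}')                  # 0.20, i.e. p = 1/5
print(f'value at p = 0: {p_sample[0]}, at p = 1: {p_sample[-1]}')   # both 0.0
```

<p>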
Notice that this function has a maximum at <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-72f17fc6f6eaf5790341a0b1a961c050_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#49;&#47;&#53;\" title=\"Rendered by QuickLaTeX.com\" height=\"18\" width=\"60\" style=\"vertical-align: -5px;\"\/> and it correctly tells us there is <em>zero<\/em> probability that <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-ef3696bb3425b816a345f8115a18fef1_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"43\" style=\"vertical-align: -4px;\"\/> or <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/ql-cache\/quicklatex.com-9e8f69673e3605f009288f06a399d3e2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#112;&#61;&#49;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"42\" style=\"vertical-align: -4px;\"\/>, in sharp contrast to bootstrapping.\n<\/p>\n<p>\n  That\u2019s where we have to end this discussion.  I don\u2019t have the time, space, or expertise to give a complete description of the best way to proceed with small-sample data of a binomial character. (I think a partial solution involves \u2018Bayesian bootstrapping\u2019 &#8211; see our initial efforts <a href=\"https:\/\/arxiv.org\/abs\/1806.01998\">here<\/a>.)  But I hope you have understood the basic reasons why even a beautiful idea like bootstrapping can go wrong.  Of course, it\u2019s fair to say that I was guilty of trying to use bootstrapping without understanding it thoroughly enough.  In any case, I hope my experience will be useful for you.\n<\/p>\n<p>\n  In short, question authority, question precedent!  
Examine the assumptions underlying the calculations you do.\n<\/p>\n<p><strong>Further reading<\/strong>\n<\/p>\n<ul>\n<li><em>An Introduction to the Bootstrap,<\/em> Bradley Efron and R.J. Tibshirani, Chapman &amp; Hall\/CRC Monographs on Statistics &amp; Applied Probability (1993).\n  <\/li>\n<li>\n     \u201cQualms About Bootstrap Confidence Intervals,\u201d Nathaniel Schenker, <a href=\"https:\/\/www.tandfonline.com\/doi\/abs\/10.1080\/01621459.1985.10478123\"><em>J. Am. Stat. Assn.<\/em> 80:360-361 (1985)<\/a>.\n  <\/li>\n<li>\n    \u201cRevisiting Qualms about Bootstrap Confidence Intervals,\u201d Michael R. Chernick &amp; Robert A. Labudde, <a href=\"https:\/\/doi.org\/10.1080\/01966324.2009.10737767\"><em>Am. J. of Math. Manag. Sci<\/em>., 29:437-456 (2009).<\/a>\n  <\/li>\n<li>\n    \u201cWeighted Ensemble Simulation: Review of Methodology, Applications, and Software,\u201d Daniel M. Zuckerman and Lillian T. Chong, <a href=\"https:\/\/www.annualreviews.org\/doi\/abs\/10.1146\/annurev-biophys-070816-033834\"><em>Annual Review of Biophysics<\/em> 46:43-57 (2017)<\/a>.\n  <\/li>\n<li>\n     \u201cA Primer on Bayesian Inference for Biophysical Systems,\u201d Keegan E. Hines, <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0006349515003033\"><em>Biophys. J.<\/em> 108:2103-2113 (2015)<\/a>.\n  <\/li>\n<li>\n    \u201c<a href=\"https:\/\/link.springer.com\/chapter\/10.1007\/978-94-010-1436-6_6\">Confidence Intervals vs Bayesian Intervals<\/a>,\u201d E. T. Jaynes and Oscar Kempthorne in: Harper W.L., Hooker C.A. (eds) Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science (Springer, Dordrecht, 1976).\n  <\/li>\n<li>\n    \u201cThe Bayesian Bootstrap,\u201d Donald B. Rubin, <a href=\"https:\/\/projecteuclid.org\/euclid.aos\/1176345338\"><em>Ann. 
Statist.<\/em> 9:130-134 (1981)<\/a>.\n  <\/li>\n<li>\n    \u201cQuantifying Uncertainty and Sampling Quality in Biomolecular Simulations,\u201d Alan Grossfield, Daniel M. Zuckerman, <a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1574140009005027\"><em>Ann. Rep. in Comp. Chem.<\/em> 5:23-48 (2009)<\/a>.\n  <\/li>\n<li> &#8220;Error analysis for small-sample, high-variance data: Cautions for bootstrapping and Bayesian bootstrapping,&#8221; Barmak Mostofian, Daniel M. Zuckerman <a href=\"https:\/\/arxiv.org\/abs\/1806.01998\">https:\/\/arxiv.org\/abs\/1806.01998<\/a>\n  <\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>I want to talk again today about the essential topic of analyzing statistical uncertainty &#8211; i.e., making error bars &#8211; but I want to frame the discussion in terms of a larger theme: our community\u2019s often insufficiently critical adoption of elegant and sophisticated ideas. I discussed this issue a bit previously in the context of 
[&hellip;]<\/p>\n","protected":false},"author":6,"featured_media":218,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20,14,18],"tags":[],"class_list":["post-213","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-bayesian-statistics","category-path-sampling","category-statistical-uncertainty"],"_links":{"self":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts\/213","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=213"}],"version-history":[{"count":10,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts\/213\/revisions"}],"predecessor-version":[{"id":227,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts\/213\/revisions\/227"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/media\/218"}],"wp:attachment":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=213"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=213"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=213"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}