{"id":103,"date":"2015-11-17T16:18:11","date_gmt":"2015-11-17T16:18:11","guid":{"rendered":"http:\/\/statisticalbiophysicsblog.org\/?p=103"},"modified":"2016-03-07T19:58:27","modified_gmt":"2016-03-07T19:58:27","slug":"faq-on-trajectory-ensembles","status":"publish","type":"post","link":"https:\/\/statisticalbiophysicsblog.org\/?p=103","title":{"rendered":"FAQ on Trajectory Ensembles"},"content":{"rendered":"<p><strong>Q: What is a trajectory?<\/strong><\/p>\n<p>A trajectory is the time-ordered sequence of system configurations which occur as all the coordinates evolve in time following some rules &#8211; hopefully rules embodying reasonable physical dynamics, such as Newton\u2019s laws or constant-temperature molecular dynamics.<\/p>\n<p><strong>Q: What is a trajectory ensemble?<\/strong><\/p>\n<p>It\u2019s a set of <em>independent <\/em>trajectories that <em>together<\/em> characterize a particular condition such as equilibrium or a non-equilibrium steady state.\u00a0 That is, the trajectories do not interact in any way, but statistically they describe some condition because of how they have been initiated &#8211; and when they are observed relative to their initialization \u2026 see below.<\/p>\n<p><!--more--><\/p>\n<p><strong>Q: What are the types of trajectory ensemble?<\/strong><\/p>\n<p>There are three types of trajectory ensembles.\u00a0 Assume each to consist of a very large number of trajectories.\u00a0 In the figures, individual trajectories are shown in blue and average behavior is shown schematically with solid lines.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-104 size-full\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq.png\" alt=\"sbb_faq\" width=\"900\" height=\"700\" srcset=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq.png 900w, https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq-300x233.png 300w, 
https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq-788x613.png 788w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/p>\n<ol>\n<li><u>Equilibrium<\/u>. This is a set of trajectories that obeys detailed balance, which in turn implies what might be called \u201carbitrary balance\u201d: for any two arbitrary regions of configuration or phase space, the number of trajectories flowing from one to the other in a fixed time interval will exactly match the reverse flow.\u00a0 The equilibrium trajectory ensemble yields the equilibrium distribution (Boltzmann factor distribution of configurations for fixed temperature) when trajectories are observed at any fixed time.\u00a0 That is, taking a set of configurations x &#8211; one from each trajectory &#8211; at any time point t will yield the Boltzmann-factor distribution of <em>configurations<\/em>.\u00a0 We can write p(x; t) = p<sup>eq<\/sup>(x), independent of t.\u00a0 <a href=\"https:\/\/statisticalbiophysicsblog.org\/?p=92\">Important properties of the equilibrium ensemble are discussed in another post.<\/a>\u00a0\u00a0 <em><u>Making the ensemble.<\/u><\/em>\u00a0 In principle, one can imagine generating an equilibrium trajectory ensemble in a couple of ways \u2026 by somehow knowing p<sup>eq<\/sup>(x) in advance, generating configurations x accordingly, and assigning each coordinate a velocity chosen from the Maxwell-Boltzmann equilibrium velocity distribution \u2026 alternatively, <em>any<\/em> initial distribution will ultimately relax to equilibrium after enough time so long as there are no external sources or sinks; that is, p(x; t &gt; t<sup>eq<\/sup>) = p<sup>eq<\/sup>(x) where t<sup>eq<\/sup> is the time required for the system to relax to equilibrium.\u00a0 See the figure below.<\/li>\n<li><u>Non-equilibrium steady state<\/u>. 
A non-equilibrium steady state is a trajectory ensemble in which the distribution of configurations is unchanging in time, but which does <em>not<\/em> obey the detailed balance condition.\u00a0 We can write p(x; t) = p<sup>ss<\/sup>(x) \u2260 p<sup>eq<\/sup>(x).\u00a0 Hence, there are net flows occurring in the phase space &#8230; from some source of trajectories (where they are initiated) to a sink (where they are absorbed).\u00a0 The steady-state ensemble is critically important because <a href=\"https:\/\/statisticalbiophysicsblog.org\/?p=8\">it enables estimation of the rate or mean first-passage time<\/a>.\u00a0 <em><u>Making the ensemble.<\/u>\u00a0 <\/em>As in the case of the equilibrium distribution, an <em>arbitrary<\/em> initial distribution will relax to a non-equilibrium steady state so long as trajectories encountering sink points are re-initiated at the source points.\u00a0 Thus, p(x; t &gt; t<sup>ss<\/sup>) = p<sup>ss<\/sup>(x) where t<sup>ss<\/sup> is the time required for the system to relax to the steady state.\u00a0 Both the source and sink can be rather arbitrary and not simply single configurations.\u00a0 For example, in the system\u2019s configuration space, trajectories could be initiated according to an arbitrary distribution on an arbitrary subset of configurations (e.g., 80% in one region and 20% in another) &#8211; and be absorbed on another arbitrary subset.<\/li>\n<li><u>Initialized<\/u>. 
The initialized trajectory ensemble evolves in time, having started from some specified distribution of configurations (possibly with velocities).\u00a0 This ensemble has not yet relaxed into equilibrium or a steady state.\u00a0 Thus, by definition, it is still in its transient phase, so the distribution of configurations p(x; t), obtained by taking all configurations at time t, changes in time.\u00a0 See the left panels in the figure below.\u00a0\u00a0 <em><u>Making the ensemble.<\/u>\u00a0 <\/em>Such an ensemble can be constructed from an arbitrary initial distribution of configurations &#8211; just by letting each trajectory evolve in time according to the natural system dynamics.\u00a0 The simplest example would be starting many trajectories from the same configuration (with different velocities and\/or random seeds) which corresponds to the time evolution of an initial delta function distribution.\u00a0 However, any initial distribution can be used.<img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-105\" src=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq2.png\" alt=\"sbb_faq2\" width=\"900\" height=\"500\" srcset=\"https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq2.png 900w, https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq2-300x167.png 300w, https:\/\/statisticalbiophysicsblog.org\/wp-content\/uploads\/2015\/11\/sbb_faq2-788x438.png 788w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/li>\n<\/ol>\n<p><strong>Q: What is an \u201cequilibrium trajectory\u201d?<\/strong><\/p>\n<p>Because equilibrium is an ensemble property that characterizes a set of configurations or trajectories, it is somewhat oxymoronic to describe a single trajectory as equilibrium (or non-equilibrium) in nature.\u00a0 However, informally, the phrase \u201cequilibrium trajectory\u201d is used when a trajectory is not subject to any artificial biasing forces; sometimes there 
is the implicit notion that the trajectory is visiting high-probability parts of configuration space.\u00a0 Of course, it is difficult to know whether a trajectory is typical or whether it has visited the important parts of configuration space without examining many trajectories or a very long trajectory.\u00a0 If a trajectory were extremely long &#8211; in the sense of exhibiting multiple transitions among all important states of a system &#8211; then it would be more reasonable to describe it as a representation of equilibrium.\u00a0 Unfortunately, for many of the most interesting biomolecular systems, this is currently impractical and is likely to remain so for years to come.\u00a0 One notable exception would appear to be the folding\/unfolding trajectories generated by the D. E. Shaw group for small proteins &#8211; but note these were generated at elevated temperatures to enable transitions.\u00a0 On a related point, how would one know whether equilibrium-like sampling has been obtained in any given simulation?\u00a0 This question has been addressed by my group and others using the notion of \u201ceffective sample size\u201d &#8211; see references below.<\/p>\n<p><strong>Q: What\u2019s the value of a trajectory ensemble compared to a string-like optimal path?<\/strong><\/p>\n<p>Methods for finding optimal paths and statistical average \u201cstrings\u201d have become increasingly popular, and they can be very valuable in principle for describing a dominant mechanism.\u00a0 It is useful to be aware of the pros and cons of such approaches as compared to the trajectory ensemble picture.\u00a0 At least in principle, the trajectory ensemble has several advantages: (i) Steady-state trajectory ensembles directly provide unbiased estimation of the rate via the <a href=\"https:\/\/statisticalbiophysicsblog.org\/?p=8\">Hill relation<\/a>, whereas optimal-path approaches have to make inferences based on possibly strong assumptions regarding the connection between 
high-dimensional landscape and kinetics.\u00a0 I hope to comment on those assumptions in a future post.\u00a0 (ii) Although a single path or string provides a reassuringly concrete picture, a true description of mechanism must account for all the heterogeneity that may be present, such as alternative pathways and\/or path fluctuations.\u00a0 The path ensemble automatically includes all the latter.\u00a0 (iii) Most single-path methods are subject to the well-known \u201ctrapping\u201d phenomenon in which the calculation fails to visit a channel which is most important to the process at hand.\u00a0 Ensemble methods may also suffer from trapping, but some variants may be more robust. \u00a0I hope to expand on this point in a future post.<\/p>\n<p><strong>References<\/strong><\/p>\n<ul>\n<li><a href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC3108504\/\" target=\"_blank\">\u201cHeterogeneous Path Ensembles for Conformational Transitions in Semiatomistic Models of Adenylate Kinase,\u201d<\/a> Bhatt, D. &amp; Zuckerman, D. M. Journal of Chemical Theory and Computation, 2010, 6, 3527-3539<\/li>\n<li><a href=\"http:\/\/www.pnas.org\/content\/86\/18\/6963.abstract\" target=\"_blank\">\u201cReaction path study of conformational transitions and helix formation in a tetrapeptide,\u201d<\/a> Czerminski, R. &amp; Elber, R., Proc. Nat. Acad. Sci., 1989, 86, 6963-6967<\/li>\n<li><a href=\"http:\/\/pubs.acs.org\/doi\/abs\/10.1021\/jp0455430\" target=\"_blank\">\u201cFinite Temperature String Method for the Study of Rare Events,\u201d <\/a>Weinan E, Weiqing Ren, Eric Vanden-Eijnden, J. Phys. Chem. B, 2005, 109 (14), pp 6688\u20136693<\/li>\n<li>Lindorff-Larsen, K.; Piana, S.; Dror, R. O. &amp; Shaw, D. E. 
<a href=\"https:\/\/www.sciencemag.org\/content\/334\/6055\/517\" target=\"_blank\">\u201cHow fast-folding proteins fold,\u201d<\/a> Science, 2011, 334, 517-520<\/li>\n<li><a href=\"http:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC2538559\/\" target=\"_blank\">\u201cOn the structural convergence of biomolecular simulations by determination of the effective sample size,\u201d<\/a> E. Lyman and D. M. Zuckerman, <em>J. Phys. Chem. B<\/em> 111:12876-12882 (2007).<\/li>\n<li><a href=\"https:\/\/www.crcpress.com\/Statistical-Physics-of-Biomolecules-An-Introduction\/Zuckerman\/9781420073782\" target=\"_blank\"><em>Statistical Physics of Biomolecules: An Introduction<\/em>,<\/a> Zuckerman, D. M., CRC Press, 2010<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Q: What is a trajectory? A trajectory is the time-ordered sequence of system configurations which occur as all the coordinates evolve in time following some rules &#8211; hopefully rules embodying reasonable physical dynamics, such as Newton\u2019s laws or constant-temperature molecular dynamics. Q: What is a trajectory ensemble? 
It\u2019s a set of independent trajectories that together [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":106,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11,8],"tags":[],"class_list":["post-103","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-general-biophysics","category-trajectory-physics"],"_links":{"self":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts\/103","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=103"}],"version-history":[{"count":6,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts\/103\/revisions"}],"predecessor-version":[{"id":126,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/posts\/103\/revisions\/126"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=\/wp\/v2\/media\/106"}],"wp:attachment":[{"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=103"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=103"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/statisticalbiophysicsblog.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=103"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}