A great deal of care goes into setting up simulations. How much time is spent thinking about the validity of their results?
Why should we care about statistical uncertainties?
Grossfield, Alan, et al. "Best Practices for Quantification of Uncertainty and Sampling Quality in Molecular Simulations [Article v1.0]." Living Journal of Computational Molecular Science 1.1 (2018).
Figure 1. RMSD as a measure of convergence. $\alpha$-carbon RMSD of the protein rhodopsin from its starting structure as a function of time.
Grossfield, Alan, et al. "Best Practices for Quantification of Uncertainty and Sampling Quality in Molecular Simulations [Article v1.0]." Living Journal of Computational Molecular Science 1.1 (2018).
Figure 2. The equilibration and production segments of a trajectory. “Equilibration” over the time $t_{equil}$ represents transient behavior while the initial configuration relaxes toward configurations more representative of the equilibrium ensemble.
Grossfield, Alan, et al. "Best Practices for Quantification of Uncertainty and Sampling Quality in Molecular Simulations [Article v1.0]." Living Journal of Computational Molecular Science 1.1 (2018).
where $N_{max}$ is the maximum number of lags.
Figure 3. Autocorrelation curve for an MCMC simulation of Cineromycin (NOE) with 100k steps.
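A curve like the one in Figure 3 can be produced with a few lines of NumPy. The sketch below is illustrative only: it assumes the common normalized estimator $C(\tau) = \langle (x_t - \bar{x})(x_{t+\tau} - \bar{x}) \rangle / s^2$ evaluated for lags $0$ to $N_{max}$, which may differ in detail from the definition used for the figure. The function name `autocorrelation` and the argument `n_max` (standing in for $N_{max}$) are hypothetical.

```python
import numpy as np

def autocorrelation(x, n_max):
    """Normalized autocorrelation of a 1-D series for lags 0..n_max (inclusive).

    Assumed estimator: mean of products of mean-subtracted values separated
    by `lag`, divided by the variance of the series.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if not 0 <= n_max < n:
        raise ValueError("n_max must satisfy 0 <= n_max < len(x)")
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    if var == 0.0:
        raise ValueError("series has zero variance")
    acf = np.empty(n_max + 1)
    for lag in range(n_max + 1):
        # Average over the n - lag pairs that are `lag` steps apart.
        acf[lag] = np.dot(dx[: n - lag], dx[lag:]) / ((n - lag) * var)
    return acf
```

By construction the value at lag 0 is 1. For an uncorrelated series the remaining values fluctuate around 0, while slowly decaying values indicate that successive samples are not independent.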
$N$ observations $\{x_{1},...,x_{N}\}$ are converted to a set of $M$ "block averages" $\{x_{1}^{b},...,x_{M}^{b}\}$, where $n$ is the number of observations in each block and
$$x_{j}^{b} = \frac{\sum_{k=1+(j-1)n}^{jn} x_{k}}{n}$$
From here, we compute the arithmetic mean of the block averages, $\bar{x}^{b}$, followed by the standard deviation of the block averages, $s(\bar{x}^{b})$,
which is used to calculate the confidence interval on $\bar{x}^{b}$.
Figure 4. A trajectory partitioned into 5 contiguous blocks.
Grossfield, Alan, et al. "Best Practices for Quantification of Uncertainty and Sampling Quality in Molecular Simulations [Article v1.0]." Living Journal of Computational Molecular Science 1.1 (2018).
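The block-averaging procedure can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `block_average` is hypothetical, leftover samples that do not fill a whole block are dropped, and the last step (dividing the standard deviation of the block averages by $\sqrt{M}$ to estimate the uncertainty of the mean) is the usual standard-error formula, assumed here rather than taken from the text above.

```python
import numpy as np

def block_average(x, n_blocks):
    """Split x into n_blocks contiguous blocks and summarize the block averages.

    Returns (mean of block averages, standard deviation of block averages,
    assumed standard error of the mean = std / sqrt(n_blocks)).
    """
    x = np.asarray(x, dtype=float)
    n = x.size // n_blocks  # block length n; trailing leftover samples are dropped
    if n_blocks < 2 or n < 1:
        raise ValueError("need at least 2 blocks with at least 1 sample each")
    blocks = x[: n * n_blocks].reshape(n_blocks, n)
    block_means = blocks.mean(axis=1)      # x_j^b for j = 1..M
    mean = block_means.mean()              # arithmetic mean of block averages
    std = block_means.std(ddof=1)          # sample standard deviation of block averages
    sem = std / np.sqrt(n_blocks)          # assumption: standard error of the mean
    return mean, std, sem
```

A confidence interval would then typically be formed as the mean plus or minus a coverage factor times the standard error. The estimate is only meaningful when the blocks are long enough to be approximately uncorrelated, so it is worth repeating the calculation for several block counts and checking that the result is stable.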
Figure 5. A simple example of MCMC. Left column: A sampling chain starting from a good starting value, the mode of the true distribution. Middle column: A sampling chain starting from a starting value in the tails of the true distribution. Right column: A sampling chain starting from a value far from the true distribution. Top row: Markov chain. Bottom row: sample density. The analytical (true) distribution is indicated by the dashed line.
Van Ravenzwaaij, Don, Pete Cassey, and Scott D. Brown. "A simple introduction to Markov Chain Monte–Carlo sampling." Psychonomic Bulletin & Review 25.1 (2018): 143-154.
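The effect of the starting value shown in Figure 5 can be reproduced with a small random-walk Metropolis sampler. The sketch below is an illustration under stated assumptions, not the code used for the figure: the target is taken to be a normal distribution, and the function name `metropolis_normal` and all parameter names and defaults are hypothetical.

```python
import numpy as np

def metropolis_normal(start, n_steps, proposal_sd=1.0, mu=0.0, sigma=1.0, seed=0):
    """Random-walk Metropolis chain targeting a Normal(mu, sigma) distribution."""
    rng = np.random.default_rng(seed)

    def log_target(v):
        # Log density up to an additive constant, which cancels in the ratio.
        return -0.5 * ((v - mu) / sigma) ** 2

    chain = np.empty(n_steps)
    current = float(start)
    for i in range(n_steps):
        proposal = current + rng.normal(0.0, proposal_sd)
        log_ratio = log_target(proposal) - log_target(current)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < np.exp(min(0.0, log_ratio)):
            current = proposal
        chain[i] = current
    return chain

# Chains started at the mode, in the tails, and far from the target.
chains = {s: metropolis_normal(s, 500) for s in (0.0, 3.0, 30.0)}
```

A chain started far from the target spends its first steps drifting toward the high-probability region. Those early samples are not representative of the target and bias the sample density unless they are discarded as equilibration (burn-in).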
Hess, Berk. "Convergence of sampling in protein simulations." Physical Review E 65.3 (2002): 031910.
$^{*}$ Grossfield, Alan, et al. "Best Practices for Quantification of Uncertainty and Sampling Quality in Molecular Simulations [Article v1.0]." Living Journal of Computational Molecular Science 1.1 (2018).