2020
DOI: 10.1063/5.0006204
Probabilistic performance estimators for computational chemistry methods: Systematic improvement probability and ranking probability matrix. II. Applications

Abstract: In the first part of this study (Paper I), we introduced the systematic improvement probability (SIP) as a tool to assess the level of improvement on absolute errors to be expected when switching between two computational chemistry methods. We also developed two indicators based on robust statistics to address the uncertainty of ranking in computational chemistry benchmarks: P_inv, the inversion probability between two values of a statistic, and P_r, the ranking probability matrix. In this second part, these…
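The SIP described in the abstract can be read as the probability that switching methods reduces the absolute error on a given system. A minimal sketch of an empirical estimate on paired absolute errors — the synthetic data and the simple plug-in estimator are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired absolute errors of two methods on the same benchmark
# set (synthetic stand-ins; method 2 is made somewhat more accurate).
abs_err_m1 = np.abs(rng.normal(0.0, 1.0, size=500))
abs_err_m2 = np.abs(rng.normal(0.0, 0.7, size=500))

# Empirical SIP: fraction of systems for which switching from method 1
# to method 2 strictly reduces the absolute error.
sip = np.mean(abs_err_m2 < abs_err_m1)
```

With these illustrative scales, `sip` comes out above 0.5, i.e. switching improves more systems than it degrades; Paper I develops this idea with proper uncertainty treatment rather than this bare point estimate.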

Cited by 9 publications (15 citation statements)
References 31 publications
“…In some instances, prediction errors due to model inadequacy can be handled by statistical correction of predictions, which may provide a reliable uncertainty measure [20]. Various surrogate methods have been developed for the estimation of prediction uncertainty, such as bootstrap-based methods, Gaussian process regression, neural networks and deep learning ensembles [21–23]. Gaussian process regression has been employed to identify particular calculations within a given dataset for which the uncertainties exceed a given threshold [24,25].…”
Section: Introduction
confidence: 99%
“…This lack of correlation supports the main message of this work: The number of fitted parameters does not represent an effective measure of the transferability of a functional. More reliable statistical criteria—such as those developed in this work, or alternatively, the probabilistic performance estimators recently introduced by Pernot and Savin [91,92]—should be used to evaluate the reliability of new and existing xc functionals.…”
Section: Statistical Criteria Of Bias–variance Tradeoff and Analysis
confidence: 99%
“…We can relate this to an increasing trend of the errors with the bandgap value. 4 In the case of PER2018, the error distributions also present large skewness and kurtosis, which can be associated with the chemical heterogeneity of the dataset. 2 For THA2015, it was noted previously 35,4 that some experimental reference data with large measurement uncertainty could not be reproduced by any method in the studied set.…”
Section: Application Of G M Cf To Ranking
confidence: 99%
“…As a benchmarking statistic, the popular mean unsigned error (MUE) bears no information on such a risk. [1–4] We have recently reported a case where two unbiased error distributions with identical values of the MUE present widely different risks of large errors because of heavy tails in one of them. 5,4 It would therefore be very useful to complement the MUE with a statistic indicating or quantifying the risk of large errors.…”
Section: Introduction
confidence: 99%
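The point made in the statement above — identical MUE, very different risk of large errors — is easy to reproduce numerically. A small numpy sketch with two illustrative unbiased distributions (a normal and a heavy-tailed Student-t, not the paper's actual data), rescaled to share the same MUE:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Two illustrative unbiased error distributions: normal vs heavy-tailed
# Student-t (df=3). Rescale the t sample so both have the same MUE.
normal_err = rng.normal(0.0, 1.0, size=n)
t_err = rng.standard_t(df=3, size=n)
t_err *= np.mean(np.abs(normal_err)) / np.mean(np.abs(t_err))

mue_normal = np.mean(np.abs(normal_err))
mue_t = np.mean(np.abs(t_err))  # matched to mue_normal by construction

# Same MUE, but the heavy-tailed distribution carries a much larger
# probability of errors exceeding a fixed threshold.
p_large_normal = np.mean(np.abs(normal_err) > 3.0)
p_large_t = np.mean(np.abs(t_err) > 3.0)
```

Despite matched MUE values, `p_large_t` exceeds `p_large_normal` by roughly an order of magnitude here, which is exactly why the quoted authors argue for complementing the MUE with a tail-risk statistic.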