Analysis of the Hamiltonian Monte Carlo genotyping algorithm on PROVEDIt mixtures including a novel precision benchmark

Susik, Mateusz; Sbalzarini, Ivo F.

doi:10.1101/2022.08.28.505600

Cited by 2 publications

(26 citation statements)

References 23 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…This is mixture E05, which was already described in Supplementary Material 2 of Ref. [14]. The LR for the locus D12S391 and the sub-sub-source hypothesis that resulted in the largest LR overall is 2.71 · 10 −27 .…”

Section: Resultsmentioning

confidence: 69%

“…Next, we consider the numbers of OotNT (Opposite of the Neutral Threshold [14]) scenarios with true and false contributors, as well as the OotNT rates [14]. The results are again given in Table 2.…”

Section: Resultsmentioning

confidence: 99%

“…The results for HMC are taken from the supplementary materials of Ref. [14], where the priors of Riman et al [12] were used, whereas here we use the modified priors described in Section 2.1. This indicates that the exact choice of priors is not crucial to the reported results.…”

Section: Resultsmentioning

confidence: 99%

“…Low run-to-run variability implies high reproducibility of the results, which is known as precision . We compare the precision of HMC as previously benchmarked [14] with the precision of SVGD and VI. For this, we perform 10 independent repetitions of the analyses for each scenario and compare two statistics: the standard deviation of the resulting log 10 LR and the difference between the largest and smallest log 10 LR.…”

Section: Resultsmentioning

confidence: 99%

“…For the following results, we used 37 2- and 3-contributor filtered Globalfiler™ mixtures that were not a part of the test benchmark [12, 14]. For the list of mixtures, please see Supplementary Material 1.…”

Section: Methodsmentioning

confidence: 99%

See 4 more Smart Citations

Variational inference accelerates accurate DNA mixture deconvolution

Susik

Sbalzarini

2022

Preprint

Self Cite

View full text Add to dashboard Cite

We investigate a class of DNA mixture deconvolution algorithms based on variational inference, and we show that this can significantly reduce computational runtimes with little or no effect on the accuracy and precision of the result. In particular, we consider Stein Variational Gradient Descent (SVGD) and Variational Inference (VI) with an evidence lower-bound objective. Both provide alternatives to the commonly used Markov-Chain Monte-Carlo methods for estimating the model posterior in Bayesian probabilistic genotyping. We demonstrate that both SVGD and VI significantly reduce computational costs over the current state of the art. Importantly, VI does so without sacrificing precision or accuracy, presenting an overall improvement over previously published methods.

show abstract

Section: Resultsmentioning

confidence: 69%

Section: Resultsmentioning

confidence: 99%

Section: Resultsmentioning

confidence: 99%

Section: Resultsmentioning

confidence: 99%

Section: Methodsmentioning

confidence: 99%

See 3 more Smart Citations

Variational inference accelerates accurate DNA mixture deconvolution

Susik

Sbalzarini

2022

Preprint

Self Cite

View full text Add to dashboard Cite

show abstract

‘Low’ LRs obtained from DNA mixtures: On calibration and discrimination performance of probabilistic genotyping software

McCarthy-Allen,

Bleka,

Ypma

et al. 2024

Preprint

View full text Add to dashboard Cite

The validity of a probabilistic genotyping (PG) system is typically demonstrated by following international guidelines for the developmental and internal validation of PG software. These guidelines mainly focus on discriminatory power. Very few studies have reported with metrics that depend on calibration of likelihood ratio (LR) systems. In this study, discriminatory power as well as various calibration metrics, such as Empirical Cross-Entropy (ECE) plots, pool adjacent violator (PAV) plots, log likelihood ratio cost (Cllr and Cllrcal), fiducial calibration discrepancy plots, and Turing’ expectation were examined using the publicly-available PROVEDIt dataset. The aim was to gain deeper insight into the performance of a variety of PG software in the ‘lower’ LR ranges (∼LR 1-10,000), with focus on DNAStatistX and EuroForMix which use maximum likelihood estimation (MLE). This may be a driving force for the end users to reconsider current LR thresholds for reporting. In previous studies, overstated ‘low’ LRs were observed for these PG software. However, applying (arbitrarily) high LR thresholds for reporting wastes relevant evidential value. This study demonstrates, based on calibration performance, that previously reported LR thresholds can be lowered or even discarded. Considering LRs >1, there was no evidence for miscalibration performance above LR ∼1,000 when using Fst 0.01. Below this LR value, miscalibration was observed. Calibration performance generally improved with the use of Fst 0.03, but the extent of this was dependent on the dataset: results ranged from miscalibration up to LR ∼100 to no evidence of miscalibration alike PG software using different methods to model peak height, HMC and STRmix.This study demonstrates that practitioners using MLE-based models should be careful when low LR ranges are reported, though applying arbitrarily high LR thresholds is discouraged. This study also highlights various calibration metrics that are useful in understanding the performance of a PG system.HighlightsDiscriminatory power and calibration performance of PG software are evaluated.The utility of various calibration metrics are explored in ‘low’ LR ranges.Focus was on DNAStatistX and EuroForMix software using the MLE method.Calibration performance was dependent on Fst value and dataset size.Results suggest reconsideration of lower LR thresholds and cautious reporting of ‘low’ LRs.

show abstract

Analysis of the Hamiltonian Monte Carlo genotyping algorithm on PROVEDIt mixtures including a novel precision benchmark

Cited by 2 publications

References 23 publications

Variational inference accelerates accurate DNA mixture deconvolution

Variational inference accelerates accurate DNA mixture deconvolution

‘Low’ LRs obtained from DNA mixtures: On calibration and discrimination performance of probabilistic genotyping software

Contact Info

Product

Resources

About