2015
DOI: 10.1101/035063
Preprint

State aggregation for fast likelihood computations in molecular evolution

Abstract: Codon models are widely used to identify the signature of selection at the molecular level and to test for changes in selective pressure during the evolution of genes encoding proteins. The large dimensionality of the Markov processes used to model codon evolution makes it difficult to use these models with large biological datasets. We propose here to use state aggregation to reduce the dimensionality of codon models and, thus, improve the computational performance of likelihood estimation on these models. We…

Cited by 2 publications
(2 citation statements)
References 47 publications
“…If lumpability is violated, the phylogenetic inference may be also biased, which was shown for DNA data when four nucleotides were recoded into fewer groups (Vera-Ruiz et al 2014). At the same time, if lumpability is valid, state aggregation can be used in algorithms to reduce the size of rate matrix which improves computational performance (Davydov et al 2017). The simulations for the two-scientist paradox, despite showing a substantially better fit for HMM, detected insignificant differences between HMM and MM rate estimates, which converged on the generating rates (mean squared error was 7×10⁻⁴ and 9×10⁻⁴ respectively).…”
Section: The Two-scientist Paradox: Hidden Markov Models
Citation type: mentioning
confidence: 99%
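
The lumpability condition mentioned in the statement above can be illustrated numerically. The following is a minimal sketch, not the method of the cited preprint: it assumes the standard strong-lumpability criterion for a continuous-time Markov chain (for every pair of blocks, all states in the source block have the same total rate into the target block). The function names `is_strongly_lumpable` and `aggregate`, the Jukes-Cantor-style example matrix, and the purine/pyrimidine partition are all illustrative assumptions.

```python
import numpy as np


def is_strongly_lumpable(Q, partition, tol=1e-12):
    """Check the strong-lumpability criterion for rate matrix Q.

    partition is a list of blocks, each a list of state indices.
    (Hypothetical helper, written for illustration only.)
    """
    Q = np.asarray(Q, dtype=float)
    for block_a in partition:
        for block_b in partition:
            # Total rate from each state of block_a into block_b.
            sums = Q[np.ix_(block_a, block_b)].sum(axis=1)
            if sums.max() - sums.min() > tol:
                return False
    return True


def aggregate(Q, partition):
    """Build the reduced rate matrix; meaningful only if Q is lumpable."""
    Q = np.asarray(Q, dtype=float)
    k = len(partition)
    Q_agg = np.zeros((k, k))
    for a, block_a in enumerate(partition):
        for b, block_b in enumerate(partition):
            # Under lumpability every row of the block gives the same sum,
            # so the first row is representative.
            Q_agg[a, b] = Q[np.ix_(block_a, block_b)].sum(axis=1)[0]
    return Q_agg


# Assumed example: equal-rate nucleotide model, states ordered A, G, C, T.
jc = np.ones((4, 4)) - 4.0 * np.eye(4)
purine_pyrimidine = [[0, 1], [2, 3]]

# Perturb one rate (A -> C) and fix the diagonal so the row still sums to 0.
perturbed = jc.copy()
perturbed[0, 2] = 3.0
perturbed[0, 0] = -5.0

lumpable_jc = is_strongly_lumpable(jc, purine_pyrimidine)
lumpable_perturbed = is_strongly_lumpable(perturbed, purine_pyrimidine)
jc_reduced = aggregate(jc, purine_pyrimidine)
```

In this sketch the equal-rate matrix satisfies the criterion for the two-block partition and reduces from 4×4 to 2×2, while the perturbed matrix does not, because state A then leaves its block at a different total rate than state G.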
“…First, we used a vertebrate one-to-one orthologs dataset (Studer et al 2008, available at http://bioinfo.unil.ch/supdata/positiveselection/Singleton.html) consisting of 767 genes (singleton dataset). This dataset was already used in previous studies of codon models (Fletcher and Yang, 2010; Gharib and Robinson-Rechavi, 2013; Davydov et al, 2017).…”
Section: Vertebrate and Drosophila Datasets
Citation type: mentioning
confidence: 99%