2023
DOI: 10.1093/sysbio/syad036

DNA Sequences Are as Useful as Protein Sequences for Inferring Deep Phylogenies

Abstract: Inference of deep phylogenies has almost exclusively used protein rather than DNA sequences, based on the perception that protein sequences are less prone to homoplasy and saturation or to issues of compositional heterogeneity than DNA sequences. Here we analyze a model of codon evolution under an idealized genetic code and demonstrate that those perceptions may be misconceptions. We conduct a simulation study to assess the utility of protein versus DNA sequences for inferring deep phylogenies, with protein-co…

Citations: cited by 9 publications (4 citation statements)
References: 97 publications (178 reference statements)
“…Furthermore, significant differences in amino acid composition were identified between sequences linked to fish and those associated with tetrapods (Table S1). When the phylogram was based on methods that are less susceptible to LBA[32], (i.e., nucleotides with a partitioned codon model including all or only first and second positions), TAPV grouped with the sequences from genomes with a shared architecture, the mammalian MARV-like clade, instead of with the fish-associated virus, XILV (Fig 1B; Figs S1-S2). Another method that can reduce LBA due to amino acid site heterogeneity, PMSF (Posterior Mean Site Frequency profiles analysis [33]), found the same tree as the putative LBA phylogram (Fig S3).…”
Section: Results (mentioning)
confidence: 99%
“…We recommend that evolutionary analyses of divergent RNA viruses include codon-partitioned models. Such data uses the same alignment hypothesis as for amino acids but have three times as much characters and less susceptibility to LBA [32]. Adding divergent outgroups can reduce ingroup LBA (as occurred here with the L protein), but also introduce new sources of bias and LBA [35].…”
Section: Discussion (mentioning)
confidence: 99%
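The statement above notes that a codon-partitioned nucleotide alignment shares its alignment hypothesis with the amino acid alignment but carries three times as many characters. A minimal sketch of that idea follows, assuming an in-frame, gap-consistent codon alignment held in a plain dictionary. The function name `split_codon_positions` and the toy sequences are hypothetical illustrations and do not come from the cited papers or any particular phylogenetics package.

```python
def split_codon_positions(alignment):
    """Split an in-frame codon alignment into three per-position alignments.

    `alignment` maps taxon names to aligned nucleotide strings (hypothetical input
    format). Returns a list of three dicts holding codon positions 1, 2 and 3.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    (length,) = lengths
    if length % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")
    # Every third column, starting at offset 0, 1 or 2, forms one partition.
    return [
        {name: seq[pos::3] for name, seq in alignment.items()}
        for pos in range(3)
    ]


# Toy data (made up for illustration): three codons per taxon.
toy = {"taxonA": "ATGGCCAAA", "taxonB": "ATGGCTAAG"}
pos1, pos2, pos3 = split_codon_positions(toy)

# The protein alignment would have one column per codon.
protein_columns = len(toy["taxonA"]) // 3
nucleotide_columns = sum(len(part["taxonA"]) for part in (pos1, pos2, pos3))
```

In this sketch, using "only first and second positions" corresponds to keeping `pos1` and `pos2` and dropping `pos3`. Fitting a separate substitution model to each partition is left to whatever inference software is used.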
“…For deep evolutionary relationships, such as the origin of eukaryotes or the diversification of animals, it is common practice to use protein-based phylogenomic analyses (Burki et al, 2020; Simion et al, 2017). This is because protein sequences are much less prone to homoplasy compared to DNA sequences because of the slower rate of observable evolution and the larger alphabet (20 amino acids compared to 4 nucleotides) (Philippe et al, 2011, Kapli et al, 2023). This makes it helpful to use amino acid sequences rather than DNA sequences for ortholog and paralog determination.…”
Section: Introduction (mentioning)
confidence: 99%
“…This makes it helpful to use amino acid sequences rather than DNA sequences for ortholog and paralog determination. However, the usefulness of DNA sequence based phylogenetics for resolving relationships between closely related species is well documented, recent work suggests their utility in deep phylogenetics too (Kapli et al, 2023).…”
Section: Introduction (mentioning)
confidence: 99%