2021
DOI: 10.1101/2021.09.22.461252
Preprint

Using all gene families vastly expands data available for phylogenomic inference

Abstract: Traditionally, single-copy orthologs have been the gold standard in phylogenomics. Most phylogenomic studies identify putative single-copy orthologs by using clustering approaches and retaining families with a single sequence from each species. However, this approach can severely limit the amount of data available by excluding larger families. Recent methodological advances have suggested several ways to include data from larger families. For instance, tree-based decomposition methods facilitate the extraction…

Cited by 2 publications (5 citation statements)
References 84 publications
“…Inclusion of SNAP-OGs increased the size of all seven datasets, sometimes substantially. We note that our results are qualitatively similar to those reported recently by Smith et al [37], which retrieved SC-OGs nested within larger families from 26 primates and examined their performance in gene tree and species tree inference. Three noteworthy differences are that we also conduct species-specific inparalog trimming, provide a user-friendly command-line software for SNAP-OG identification, and evaluated the phylogenetic information content of SNAP-OGs and SC-OGs across seven diverse phylogenomic datasets.…”
Section: Discussion (supporting)
confidence: 90%
“…Notably, these software are also different from sequence similarity graph-based inferences of subgroups of single-copy orthologous genes—such as the algorithm implemented in OMA [21]. Finally, our results, together with other studies, demonstrate the utility of SC-OGs that are nested within larger families [15,20,37,38].…”
Section: Discussion (mentioning)
confidence: 58%
“…Inclusion of SNAP-OGs increased the size of all 7 datasets, sometimes substantially. We note that our results are qualitatively similar to those reported recently by Smith and colleagues [37], which retrieved SC-OGs nested within larger families from 26 primates and examined their performance in gene tree and species tree inference. Three noteworthy differences are that we also conduct species-specific inparalog trimming, provide a user-friendly command-line software for SNAP-OG identification, and evaluated the phylogenetic information content of SNAP-OGs and SC-OGs across 7 diverse phylogenomic datasets.…”
Section: Discussion (supporting)
confidence: 90%
“…Moreover, examination of evolutionary histories facilitates the identification of species-specific inparalogs. Finally, our results, together with other studies, demonstrate the utility of SC-OGs that are nested within larger families [15, 20, 37, 38].…”
Section: Discussion (supporting)
confidence: 81%