2018
DOI: 10.1016/j.dib.2018.04.094

Genome-scale DNA sequence data and the evolutionary history of placental mammals

Abstract: We present a genomic data set comprising the coding DNA sequences of 5162 loci from 90 vertebrate species, including 82 mammals. The loci were aligned with their protein sequences. The aligned protein sequences were then back-translated into their original DNA sequences. The alignments were further filtered to remove individual sequences from each alignment exhibiting long branches or other unusual features. The data are deposited in figshare (http://figshare.com/articles/cds_5162.zip/6031190) and will be use…
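The back-translation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `back_translate`, the gap character, and the handling of a trailing stop codon are all assumptions made for illustration. The idea is to thread each unaligned coding sequence onto its gapped protein alignment row, emitting one codon per residue and three gap characters per alignment gap.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def back_translate(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an ungapped coding sequence onto a gapped protein alignment row.

    Hypothetical helper for illustration only; it does not check that each
    codon actually encodes the aligned residue.
    """
    # Split the coding sequence into complete codons.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]

    # Count the residues (non-gap columns) in the aligned protein row.
    n_residues = sum(1 for aa in aligned_protein if aa != gap)

    # Assumption: a single trailing stop codon is not represented in the protein.
    if len(codons) == n_residues + 1 and codons[-1] in STOP_CODONS:
        codons = codons[:-1]

    if len(codons) != n_residues:
        raise ValueError(
            f"codon count ({len(codons)}) does not match residue count ({n_residues})"
        )

    # Emit three gap characters per protein gap, otherwise the next codon.
    codon_iter = iter(codons)
    return "".join(gap * 3 if aa == gap else next(codon_iter) for aa in aligned_protein)


if __name__ == "__main__":
    print(back_translate("M-K", "ATGAAA"))
```

Applying the function to every row of a protein alignment yields a codon alignment whose length is three times the protein alignment length, which keeps the reading frame intact for downstream filtering.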

Cited by 20 publications (15 citation statements)
References: 8 publications
“…In many cases, it was difficult to determine whether these alignments had been rigorously curated, and even more challenging to find datasets for which the root position of a number of subclades could be assumed with confidence. The only dataset that met all of our criteria was a dataset of placental mammals with 78 ingroup taxa and 3,050,199 amino acids (Wu, et al 2019). This dataset was originally published as an MSA (Liu, et al 2017) based on very high-quality sequences from Ensembl, NCBI, and GenBank databases.…”
Section: Empirical Datasets (mentioning)
confidence: 99%
“…These exchanges and issues prompted us to explore in a general way the effects of alignment uncertainty and priors on a large phylogenomic data set in mammals, a useful test group with a history of coalescent analyses on large and diverse data sets [42–44]. A larger and improved set of alignments of [34] based on careful codon-based alignment and a state-of-the-art trimming pipeline is now available [45]. The current study comprehensively analyzes this data set of 5162 loci (total alignment length 9,150,597-14,623,557 bp) and 90 species [45] to evaluate the effects of alignment uncertainty, substitution model, and fossil priors on gene tree, species tree, and divergence time estimation in mammals.…”
Section: Introduction (mentioning)
confidence: 99%
“…A larger and improved set of alignments of [34] based on careful codon-based alignment and a state-of-the-art trimming pipeline is now available [45]. The current study comprehensively analyzes this data set of 5162 loci (total alignment length 9,150,597-14,623,557 bp) and 90 species [45] to evaluate the effects of alignment uncertainty, substitution model, and fossil priors on gene tree, species tree, and divergence time estimation in mammals.…”
Section: Introduction (mentioning)
confidence: 99%
“…After data reduction, the data sets contained alignments of 36 (Aitken et al. 2017) to 4709 (Wu et al. 2018) loci, each with 10 species (Table 1).…”
Section: Methods (mentioning)
confidence: 99%