2019
DOI: 10.1111/zsc.12335

Causes and analytical impacts of missing data in RADseq phylogenetics: Insights from an African frog (Afrixalus)

Abstract: Restriction site-associated DNA sequencing (RADseq) has emerged as a useful tool in systematics and population genomics. A common feature of RADseq data sets is that they contain missing data that arise from multiple sources including genealogical sampling bias, assembly methodology and sequencing error. Many RADseq studies have demonstrated that allowing sites (single nucleotide polymorphisms, SNPs) with missing data can increase support for phylogenetic hypotheses. Two non-mutually exclusive explanations for…

citations
Cited by 38 publications
(33 citation statements)
references
References 71 publications
“…A correlation between sample size (i.e. number of sampled populations per region) and phylogenetic resolution has been demonstrated in other RADseq-based phylogenetic studies [47, 48] and for several AFLP-inferred population genetic measures [49]. Small sample sizes and missing data might not be as much a problem for RADseq as for AFLP.…”
Section: Discussion (mentioning)
confidence: 72%
“…Missing data is especially problematic in RADseq as it can lead to erroneous inference of population-genetic parameters (Arnold et al, 2013; Gautier et al, 2013; Hodel et al, 2017). However, applying stringent filtering for missing data has been shown to prune parsimonious-informative loci, with best-practices suggesting non-conservative pruning of the data (Huang & Lacey Knowles, 2016; Lee et al, 2018; Crotti et al, 2019). To mitigate missing data while avoiding stringent filtering, we applied a novel procedure, which allowed us to retrieve more loci from our data (Cerca et al, 2021).…”
Section: Methods (mentioning)
confidence: 99%
“…Rates of allele dropout are thereby expected to be correlated with the divergence between lineages (Crotti et al., 2019; Eaton et al., 2017; O'Leary et al., 2018). However, allele dropout may also result from artefacts in the experimental design, such as sampling bias and low sequence coverage; or from problems associated with library preparation, such as issues with enzyme digestion or size selection, and human error; or from challenges in DNA extraction since, for some organismal groups, extracting DNA may still be non‐trivial due to their reduced size or presence of chemical compounds which may interfere with the extraction; or from artefacts from bioinformatic analyses, such as problems associated with clustering of sequencing reads (Crotti et al., 2019; O'Leary et al., 2018). In fact, allele dropout originating from these technical artefacts can sometimes exceed dropout of biological origin under certain experimental conditions (Rivera‐Colón et al., 2020).…”
Section: Introduction (mentioning)
confidence: 99%
“…In fact, allele dropout originating from these technical artefacts can sometimes exceed dropout of biological origin under certain experimental conditions (Rivera‐Colón et al., 2020). Whatever the case may be, high allele dropout translates to high rates of missing data in the dataset, which may dramatically influence allele frequency in the dataset (Arnold et al., 2013; Gautier et al., 2013; Hodel et al., 2017), or phylogenetic reconstruction (Crotti et al., 2019; Eaton et al., 2017).…”
Section: Introduction (mentioning)
confidence: 99%