Advances in DNA sequencing technology have revolutionized the field of molecular analysis of trophic interactions, and it is now possible to recover counts of food DNA sequences from a wide range of dietary samples. But what do these counts mean? To obtain an accurate estimate of a consumer's diet, should we work strictly with data sets summarizing the frequency of occurrence of different food taxa, or is it possible to use the relative number of sequences? Both approaches are applied to obtain semi-quantitative diet summaries, but occurrence data are often promoted as a more conservative and reliable option due to taxon-specific biases in the recovery of sequences. We explore representative dietary metabarcoding data sets and point out that diet summaries based on occurrence data often overestimate the importance of food consumed in small quantities (potentially including low-level contaminants) and are sensitive to the count threshold used to define an occurrence. Our simulations indicate that using relative read abundance (RRA) information often provides a more accurate view of population-level diet even with moderate recovery biases incorporated; however, RRA summaries are sensitive to recovery biases impacting common diet taxa. Both approaches are more accurate when the mean number of food taxa in samples is small. The ideas presented here highlight the need to consider all sources of bias and to justify the methods used to interpret count data in dietary metabarcoding studies. We encourage researchers to continue addressing methodological challenges and acknowledge unanswered questions to help spur future investigations in this rapidly developing area of research.
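The contrast between occurrence-based and RRA-based summaries can be illustrated with a minimal sketch. The read counts, taxon names, and function names below are hypothetical, and the definitions used (frequency of occurrence as the fraction of samples in which a taxon passes a within-sample proportion threshold; RRA as the mean within-sample read proportion) are assumptions made for illustration, not necessarily the exact formulas used in the study.

```python
from collections import defaultdict

# Hypothetical read counts: one dict per dietary sample (food taxon -> sequence count).
SAMPLES = [
    {"fish": 980, "squid": 15, "krill": 5},
    {"fish": 500, "squid": 500},
    {"fish": 990, "krill": 10},
]


def read_proportions(sample):
    """Convert one sample's raw counts to proportions of that sample's total reads."""
    total = sum(sample.values())
    if total == 0:
        return {}
    return {taxon: count / total for taxon, count in sample.items()}


def frequency_of_occurrence(samples, min_prop=0.0):
    """Fraction of samples in which each taxon counts as present.

    Assumed definition: a taxon is present when it has at least one read and
    its within-sample read proportion is >= min_prop (the occurrence threshold).
    """
    taxa = {taxon for sample in samples for taxon in sample}
    hits = dict.fromkeys(taxa, 0)
    for sample in samples:
        for taxon, prop in read_proportions(sample).items():
            if prop > 0 and prop >= min_prop:
                hits[taxon] += 1
    return {taxon: hits[taxon] / len(samples) for taxon in taxa}


def relative_read_abundance(samples):
    """Mean within-sample read proportion per taxon (absent taxa contribute 0)."""
    sums = defaultdict(float)
    for sample in samples:
        for taxon, prop in read_proportions(sample).items():
            sums[taxon] += prop
    return {taxon: total / len(samples) for taxon, total in sums.items()}


if __name__ == "__main__":
    print("FOO, no threshold:", frequency_of_occurrence(SAMPLES))
    print("FOO, 2% threshold:", frequency_of_occurrence(SAMPLES, min_prop=0.02))
    print("RRA:", relative_read_abundance(SAMPLES))
```

In this toy data set, krill occurs in two of three samples when no threshold is applied, the same frequency as squid, although its mean share of reads is only 0.5%. Raising the occurrence threshold to 2% removes krill entirely, which illustrates both the overweighting of minor items and the threshold sensitivity described above.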
Studies of insect assemblages are suited to the simultaneous DNA-based identification of multiple taxa known as metabarcoding. To obtain accurate estimates of diversity, metabarcoding markers ideally possess appropriate taxonomic coverage to avoid PCR-amplification bias, as well as sufficient sequence divergence to resolve species. We used in silico PCR to compare the taxonomic coverage and resolution of newly designed insect metabarcodes (targeting 16S) with that of existing markers [16S and cytochrome c oxidase subunit I (COI)] and then compared their efficiency in vitro. Existing metabarcoding primers amplified in silico <75% of insect species with complete mitochondrial genomes available, whereas new primers targeting 16S provided >90% coverage. Furthermore, metabarcodes targeting COI appeared to introduce taxonomic PCR-amplification bias, typically amplifying a greater percentage of Lepidoptera and Diptera species, while failing to amplify certain orders in silico. To test whether bias predicted in silico was observed in vitro, we created an artificial DNA blend containing equal amounts of DNA from 14 species, representing 11 insect orders and one arachnid. We PCR-amplified the blend using five primer sets, targeting either COI or 16S, with high-throughput amplicon sequencing yielding more than 6 million reads. In vitro results typically corresponded to in silico PCR predictions, with newly designed 16S primers detecting 11 insect taxa present, thus providing equivalent or better taxonomic coverage than COI metabarcodes. Our results demonstrate that in silico PCR is a useful tool for predicting taxonomic bias in mixed template PCR and that researchers should be wary of potential bias when selecting metabarcoding markers.
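The idea behind in silico PCR and taxonomic coverage can be sketched as follows. This is a simplified illustration, not the tool or settings used in the study: it requires exact matches to (possibly degenerate) primers, whereas dedicated in silico PCR programs typically allow a configurable number of mismatches. All sequences, primers, and function names are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
# Complement table that also handles degenerate codes (e.g. R <-> Y, K <-> M).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def revcomp(seq):
    """Reverse complement of a DNA sequence, including IUPAC degenerate bases."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_regex(primer):
    """Translate a primer with IUPAC codes into a regex fragment."""
    return "".join(IUPAC[base] for base in primer.upper())


def in_silico_pcr(template, fwd, rev, max_insert=500):
    """Return the predicted amplicon (primers included), or None if no product.

    Assumption: both primers must match exactly (no mismatches tolerated), and
    at most max_insert bases may lie between the two primer-binding sites.
    """
    pattern = re.compile(
        primer_regex(fwd)
        + "[ACGT]{0,%d}?" % max_insert
        + primer_regex(revcomp(rev))
    )
    match = pattern.search(template.upper())
    return match.group(0) if match else None


def coverage(templates, fwd, rev):
    """Fraction of template sequences predicted to amplify with this primer pair."""
    amplified = sum(1 for t in templates.values() if in_silico_pcr(t, fwd, rev))
    return amplified / len(templates)


# Hypothetical toy templates; real use would load mitochondrial genomes.
TEMPLATES = {
    "species_a": "TTTACGTACGGGGGTTGCCACCC",
    "species_b": "TTTACGTTCGGGGGTTGCCACCC",  # mismatch in the forward primer site
}
FWD, REV = "ACGTRC", "TGGYAA"  # hypothetical degenerate primers

if __name__ == "__main__":
    for name, seq in TEMPLATES.items():
        print(name, in_silico_pcr(seq, FWD, REV))
    print("coverage:", coverage(TEMPLATES, FWD, REV))
```

Running the same coverage calculation per order, rather than over all species together, is one way to expose the kind of taxonomic amplification bias described above.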