Metabarcoding is by now a well‐established method for biodiversity assessment in terrestrial, freshwater, and marine environments. Metabarcoding data sets are usually used for α‐ and β‐diversity estimates, that is, interspecies (or inter‐MOTU [molecular operational taxonomic unit]) patterns. However, the use of hypervariable metabarcoding markers may provide an enormous amount of intraspecies (intra‐MOTU) information, mostly untapped so far. The use of cytochrome oxidase I (COI) amplicons is gaining momentum in metabarcoding studies targeting eukaryote richness. COI has long been the marker of choice in population genetics and phylogeographic studies. Therefore, COI metabarcoding data sets may be used to study intraspecies patterns and phylogeographic features for hundreds of species simultaneously, opening a new field that we propose to call metaphylogeography. The main challenge for the implementation of this approach is the separation of erroneous sequences from true intra‐MOTU variation. Here, we develop a cleaning protocol based on changes in entropy of the different codon positions of the COI sequence, together with co‐occurrence patterns of sequences. Using a data set of community DNA from several benthic littoral communities in the Mediterranean and Atlantic seas, we first tested by simulation, on a subset of sequences, a two‐step cleaning approach consisting of a denoising step followed by minimal abundance filtering. The procedure was then applied to the whole data set. We obtained a total of 563 MOTUs that were usable for phylogeographic inference. We used semiquantitative rank data instead of read abundances to perform AMOVAs and build haplotype networks. Genetic variability was mainly concentrated within samples, but with an important between‐seas component as well. There were intergroup differences in the amount of variability between and within communities in each sea.
For two species, the results could be compared with traditional Sanger sequence data available for the same zones, giving similar patterns. Our study shows that metabarcoding data can be used to infer intra‐ and interpopulation genetic variability of many species at a time, providing a new method with great potential for basic biogeography, connectivity and dispersal studies, and for the more applied fields of conservation genetics, invasion genetics, and design of protected areas.
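The entropy measure underlying the cleaning protocol described above can be illustrated with a minimal sketch. It computes the mean Shannon entropy of each codon position over a set of aligned sequences. The function name `codon_position_entropy`, the `first_codon_pos` parameter, and the toy sequences are assumptions made for illustration; this is not the authors' protocol, which tracks how these entropies change during cleaning and also uses co-occurrence patterns.

```python
import math
from collections import Counter


def codon_position_entropy(seqs, first_codon_pos=1):
    """Mean Shannon entropy (bits) of codon positions 1, 2 and 3.

    seqs: aligned sequences of equal length (assumption: no gaps).
    first_codon_pos: codon position (1-3) of the first base of the
        alignment (assumption: the reading frame is known).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    totals = {1: 0.0, 2: 0.0, 3: 0.0}
    counts = {1: 0, 2: 0, 3: 0}
    for i in range(length):
        # Nucleotide frequencies in this alignment column
        column = Counter(s[i] for s in seqs)
        n = sum(column.values())
        h = -sum((c / n) * math.log2(c / n) for c in column.values())
        # Map the column index to its codon position (1, 2 or 3)
        pos = (i + first_codon_pos - 1) % 3 + 1
        totals[pos] += h
        counts[pos] += 1
    return {p: (totals[p] / counts[p] if counts[p] else 0.0) for p in (1, 2, 3)}


if __name__ == "__main__":
    # Hypothetical toy alignment: variation only at third codon positions
    toy = ["ATGATG", "ATAATC", "ATGATT", "ATAATG"]
    print(codon_position_entropy(toy))
```

In real COI data the third codon position is expected to show the highest entropy, because most synonymous changes occur there. In the toy alignment only the third positions vary, so their mean entropy is 1.25 bits and the other two are zero.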
Background: The recent blooming of metabarcoding applications to biodiversity studies comes with some relevant methodological debates. One such issue concerns the treatment of reads by denoising or by clustering methods, which have been wrongly presented as alternatives. It has also been suggested that denoised sequence variants should replace clusters as the basic unit of metabarcoding analyses, missing the fact that sequence clusters are a proxy for species-level entities, the basic unit in biodiversity studies. We argue here that methods developed and tested for ribosomal markers have been uncritically applied to highly variable markers such as cytochrome oxidase I (COI) without conceptual or operational (e.g., parameter setting) adjustment. COI has a naturally high intraspecies variability that should be assessed and reported, as it is a source of highly valuable information. We contend that denoising and clustering are not alternatives. Rather, they are complementary and both should be used together in COI metabarcoding pipelines.
Results: Using a COI dataset from benthic marine communities, we compared two denoising procedures (based on the UNOISE3 and the DADA2 algorithms), set suitable parameters for denoising and clustering, and applied these steps in different orders. Our results indicated that the UNOISE3 algorithm preserved a higher intra-cluster variability. We introduce the program DnoisE to implement the UNOISE3 algorithm taking into account the natural variability (measured as entropy) of each codon position in protein-coding genes. This correction increased the number of sequences retained by 88%. The order of the steps (denoising and clustering) had little influence on the final outcome.
Conclusions: We highlight the need for combining denoising and clustering, with adequate choice of stringency parameters, in COI metabarcoding. We present a program that uses the coding properties of this marker to improve the denoising step.
We recommend that researchers report their results in terms of both denoised sequences (a proxy for haplotypes) and clusters formed (a proxy for species), and avoid collapsing the sequences of the latter into a single representative. This will allow studies at the cluster level (ideally equating to species-level diversity) and at the intra-cluster level, and will ease additivity and comparability between studies.
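The idea of an entropy-aware denoising step can be sketched as follows. In the UNOISE family of algorithms, a sequence is treated as an error derived from a more abundant sequence when their abundance ratio does not exceed β(d) = 1 / 2^(αd + 1), where d is the number of differences and α is a stringency parameter. The sketch below replaces the plain count d with a distance in which each difference is weighted by the relative entropy of its codon position. The weighting formula, the entropy values, the function names, and α = 2 are assumptions made for illustration and are not taken from the abstract; this is not the DnoisE implementation.

```python
def weighted_distance(query, centroid, entropy, first_codon_pos=1):
    """Entropy-weighted count of differences between two aligned sequences.

    entropy: dict mapping codon position (1, 2, 3) to its entropy.
    Assumed weighting: each mismatch counts entropy[pos] * 3 / sum(entropy),
    so equal entropies reduce to the plain number of differences.
    """
    total = sum(entropy[p] for p in (1, 2, 3))
    d = 0.0
    for i, (a, b) in enumerate(zip(query, centroid)):
        if a != b:
            pos = (i + first_codon_pos - 1) % 3 + 1
            d += entropy[pos] * 3 / total
    return d


def beta(d, alpha=2.0):
    """UNOISE abundance-skew threshold for distance d."""
    return 1.0 / 2 ** (alpha * d + 1)


def is_noise(query_abund, centroid_abund, d, alpha=2.0):
    """True if the rarer sequence would be merged into the centroid."""
    return query_abund / centroid_abund <= beta(d, alpha)


if __name__ == "__main__":
    # Hypothetical entropies: third codon position is the most variable
    entropy = {1: 0.5, 2: 0.25, 3: 1.25}
    centroid, query = "ATAATG", "ATGATG"  # one third-position difference

    d_plain = sum(a != b for a, b in zip(query, centroid))
    d_weighted = weighted_distance(query, centroid, entropy)

    # Same abundances, different verdicts depending on the distance used
    print(is_noise(10, 100, d_plain))
    print(is_noise(10, 100, d_weighted))
```

Under these assumptions, a single difference at a highly variable third position yields a larger effective distance and hence a stricter threshold, so the rarer sequence is more likely to be kept as a genuine variant. This is consistent with the reported increase in retained sequences after the entropy correction.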
In the marine realm, biomonitoring using eDNA of benthic communities requires destructive direct sampling or the setting-up of settlement structures. Comparatively much less effort is required to sample the water column, which can be accessed remotely. In this study we assess the feasibility of obtaining information from the eukaryotic benthic communities by sampling the adjacent water layer. We studied two different rocky-substrate benthic communities with a technique based on quadrat sampling. We also took replicate water samples at four distances (0, 0.5, 1.5, and 20 m) from the benthic habitat. Using broad range primers to amplify a ca. 313 bp fragment of the cytochrome oxidase subunit I gene, we obtained a total of 3,543 molecular operational taxonomic units (MOTUs). The structure obtained in the two environments was markedly different, with Metazoa, Archaeplastida and Stramenopiles being the most diverse groups in benthic samples, and Hacrobia, Metazoa and Alveolata in the water. Only 265 MOTUs (7.5%) were shared between benthos and water samples and, of these, 180 (5.1%) were identified as benthic taxa that left their DNA in the water. Most of them were found immediately adjacent to the benthos, and their number decreased as we moved apart from the benthic habitat. It was concluded that water eDNA, even in the close vicinity of the benthos, was a poor proxy for the analysis of benthic structure, and that direct sampling methods are required for monitoring these complex communities via metabarcoding.
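The percentages quoted above can be checked with a few lines of arithmetic. Both figures are fractions of the 3,543 MOTUs in the full data set, so the 180 benthic-origin MOTUs are 5.1% of the total, not 5.1% of the shared set. The counts come from the abstract; the variable and function names are illustrative, and the share of benthic-origin MOTUs within the shared set is derived here rather than reported in the text.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


TOTAL_MOTUS = 3543       # all MOTUs detected (benthos + water)
SHARED_MOTUS = 265       # MOTUs found in both benthos and water
BENTHIC_IN_WATER = 180   # shared MOTUs identified as benthic taxa

if __name__ == "__main__":
    print(pct(SHARED_MOTUS, TOTAL_MOTUS))       # shared, relative to total
    print(pct(BENTHIC_IN_WATER, TOTAL_MOTUS))   # benthic-origin, relative to total
    print(pct(BENTHIC_IN_WATER, SHARED_MOTUS))  # benthic-origin, relative to shared
```

The first two values reproduce the 7.5% and 5.1% in the abstract. The third shows that roughly two thirds of the shared MOTUs were attributed to benthic taxa.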
Sponges have recently been proposed as ideal candidates to act as natural samplers for environmental DNA due to their efficiency in filtering water. However, validation of the usefulness of DNA recovered from sponges to reveal vertebrate biodiversity patterns in Marine Protected Areas is still needed. Additionally, nothing is known about how different sponge species and morphologies influence the capture of environmental DNA and whether biodiversity patterns obtained from sponges are best described by quantitative or qualitative measures. In this study, we amplified and sequenced a vertebrate-specific 12S barcode with a set of universal PCR primers (MiFish) for metabarcoding environmental DNA from fishes, to unveil fine-scale patterns of fish communities from natural-sampler DNA retrieved from 64 sponges (16 species) located in eutrophic and well-preserved coral reefs in Nha Trang Bay (central Vietnam). Ninety tropical fish species were identified from the sponges, corresponding to one third of the total local ichthyofauna reported from previous extensive conventional surveys. Significant differentiation in fish communities between eutrophic and well-preserved environments was observed, although eutrophication explained only a modest proportion of the variation between fish communities. Differences in efficiency of capturing fish environmental DNA among sponge species or morphologies were not observed. Overall, the majority of detected fish species corresponded to reef-associated small-sized species, as expected in coral reef environments. Remarkably, pelagic, migratory, and deep-sea fish species were also recovered from sponge tissues, pointing to the ability of natural-sampler DNA from sponges to detect fishes that were not permanently associated with the biomes where the sponges were sampled. These results highlight the suitability of natural samplers as a cost-effective way to assess vertebrate diversity in hyper-diverse environments.
The recent blooming of metabarcoding applications to biodiversity studies comes with some relevant methodological debates. One such issue concerns the treatment of reads by denoising or by clustering methods, which have been wrongly presented as alternatives. It has also been suggested that denoised sequence variants should replace clusters as the basic unit of metabarcoding analyses, missing the fact that sequence clusters are a proxy for species-level entities, the basic unit in biodiversity studies. We argue here that methods developed and tested for ribosomal markers have been uncritically applied to highly variable markers such as cytochrome oxidase I (COI) without conceptual or operational (e.g., parameter setting) adjustment. COI has a naturally high intraspecies variability that should be assessed and reported, as it is a source of highly valuable information. We contend that denoising and clustering are not alternatives. Rather, they are complementary and both should be used together in COI metabarcoding pipelines. Using a typical dataset from benthic marine communities, we compared two denoising procedures (based on the UNOISE3 and the DADA2 algorithms), set suitable parameters for denoising and clustering COI datasets, and compared the outcome of applying these processes in different orders. Our results indicate that denoising based on the UNOISE3 algorithm preserves a higher intra-cluster variability. We suggest and test ways to improve this algorithm taking into account the natural variability of each codon position in coding genes. The order of the steps (denoising and clustering) has little influence on the final outcome. We recommend that researchers consider reporting their results in terms of both denoised sequences (a proxy for haplotypes) and clusters formed (a proxy for species), and avoid collapsing the sequences of the latter into a single representative.
This will allow studies at the cluster (ideally equating species-level diversity) and at the intra-cluster level, and will ease additivity and comparability between studies.