Environmental DNA (eDNA) metabarcoding is a promising method to monitor species and community diversity that is rapid, affordable and non‐invasive. The longstanding needs of the eDNA community are modular informatics tools, comprehensive and customizable reference databases, flexibility across high‐throughput sequencing platforms, fast multilocus metabarcode processing and accurate taxonomic assignment. Improvements in bioinformatics tools make addressing each of these demands within a single toolkit a reality. The new modular metabarcode sequence toolkit Anacapa (https://github.com/limey-bean/Anacapa/) addresses the above needs, allowing users to build comprehensive reference databases and assign taxonomy to raw multilocus metabarcode sequence data. A novel aspect of Anacapa is its database building module, “Creating Reference libraries Using eXisting tools” (CRUX), which generates comprehensive reference databases for specific user‐defined metabarcoding loci. The Quality Control and ASV Parsing module sorts and processes multiple metabarcoding loci, handling merged, unmerged and unpaired reads to maximize recovered diversity. DADA2 then detects amplicon sequence variants (ASVs), and the Anacapa Classifier module aligns these ASVs to CRUX‐generated reference databases using Bowtie2. Lastly, taxonomy is assigned to ASVs with confidence scores using a Bayesian Lowest Common Ancestor (BLCA) method. The Anacapa Toolkit also includes an R package, ranacapa, for automated results exploration through standard biodiversity statistical analysis. Benchmarking tests verify that the Anacapa Toolkit effectively and efficiently generates comprehensive reference databases that capture taxonomic diversity, and can assign taxonomy to both MiSeq and HiSeq‐length sequence data. We demonstrate the value of the Anacapa Toolkit in assigning taxonomy to seawater eDNA samples collected in southern California.
The Anacapa Toolkit improves the functionality of eDNA and streamlines biodiversity assessment and management by generating metabarcode specific databases, processing multilocus data, retaining a larger proportion of sequencing reads and expanding non‐traditional eDNA targets. All the components of the Anacapa Toolkit are open and available in a virtual container to ease installation.
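The idea of assigning taxonomy at the lowest rank that the matching references agree on, with a confidence score per rank, can be illustrated with a small sketch. This is not the BLCA implementation used by Anacapa: it is a simplified weighted-vote lowest common ancestor, and the rank list, the function name `lca_with_confidence`, the integer hit weights, the 0.8 cutoff and the placeholder taxon names are all assumptions made for illustration.

```python
from collections import Counter

# Assumed rank order; real reference databases may use more or fewer ranks.
RANKS = ["phylum", "class", "order", "family", "genus", "species"]


def lca_with_confidence(hits, min_conf=0.8):
    """Walk down the ranks and keep each rank while the best-supported
    name holds at least `min_conf` of the total hit weight.

    hits: list of (lineage, weight) pairs, where lineage is a list of
          names ordered like RANKS and weight is a positive number
          (for example, an alignment-derived score; hypothetical here).
    Returns a list of (rank, name, confidence) tuples.
    """
    total = sum(weight for _, weight in hits)
    assignment = []
    prefix = ()  # names accepted so far, one per rank
    for depth, rank in enumerate(RANKS):
        votes = Counter()
        for lineage, weight in hits:
            # Only hits that agree with every accepted rank may vote.
            if len(lineage) > depth and tuple(lineage[:depth]) == prefix:
                votes[lineage[depth]] += weight
        if not votes:
            break
        name, support = votes.most_common(1)[0]
        confidence = support / total
        if confidence < min_conf:
            break  # stop at the last rank that was well supported
        assignment.append((rank, name, confidence))
        prefix += (name,)
    return assignment


# Made-up hits for a single ASV (placeholder taxon names, not real data).
example_hits = [
    (["PhylumP", "ClassC", "OrderO", "FamilyF", "GenusA", "GenusA sp1"], 5),
    (["PhylumP", "ClassC", "OrderO", "FamilyF", "GenusA", "GenusA sp2"], 3),
    (["PhylumP", "ClassC", "OrderO", "FamilyF", "GenusB", "GenusB sp1"], 2),
]

for rank, name, confidence in lca_with_confidence(example_hits):
    print(f"{rank}: {name} (confidence {confidence:.2f})")
```

In this made-up example the assignment stops at genus, because no single species holds at least 80% of the hit weight. A Bayesian method such as BLCA derives its confidence scores differently, so the numbers here only illustrate the stopping rule.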
are co-equal second authors. Robert Wayne and Rachel S. Meyer are co-equal senior authors.
The non-human primate reference transcriptome resource (NHPRTR, available online at http://nhprtr.org/) aims to generate comprehensive RNA-seq data from a wide variety of non-human primates (NHPs), from lemurs to hominids. In the 2012 Phase I of the NHPRTR project, 19 billion fragments or 3.8 terabases of transcriptome sequences were collected from pools of ∼20 tissues in 15 species and subspecies. Here we describe a major expansion of NHPRTR by adding 10.1 billion fragments of tissue-specific RNA-seq data. For this effort, we selected 11 of the original 15 NHP species and subspecies and constructed total RNA libraries for the same ∼15 tissues in each. The sequence quality is such that 88% of the reads align to human reference sequences, allowing us to compute the full list of expression abundance across all tissues for each species, using the reads mapped to human genes. This update also includes improved transcript annotations derived from RNA-seq data for rhesus and cynomolgus macaques, two of the most commonly used NHP models, and additional RNA-seq data compiled from related projects. Together, these comprehensive reference transcriptomes from multiple primates serve as a valuable community resource for genome annotation, gene dynamics and comparative functional analysis.
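The sequencing totals quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract; the variable names are illustrative, and the interpretation of the per-fragment length is an inference rather than something the abstract states.

```python
# Figures taken from the abstract, written as exact integers.
PHASE1_FRAGMENTS = 19_000_000_000       # 19 billion fragments (Phase I)
PHASE1_BASES = 3_800_000_000_000        # 3.8 terabases (Phase I)
EXPANSION_FRAGMENTS = 10_100_000_000    # 10.1 billion fragments (this update)

# Average sequenced bases per fragment in Phase I.
bases_per_fragment = PHASE1_BASES // PHASE1_FRAGMENTS

# Combined fragment count after the expansion.
total_fragments = PHASE1_FRAGMENTS + EXPANSION_FRAGMENTS

print(f"Phase I bases per fragment: {bases_per_fragment}")
print(f"Total fragments after expansion: {total_fragments / 1e9:.1f} billion")
```

The Phase I figures work out to 200 bases per fragment, which would be consistent with, for example, paired 100-base reads, although the abstract does not state the read layout.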
RNA-based next-generation sequencing (RNA-Seq) provides a tremendous amount of new information regarding gene and transcript structure, expression and regulation. This is particularly true for non-coding RNAs, where whole transcriptome analyses have revealed that much of the genome is transcribed and that many non-coding transcripts have widespread functionality. However, uniform resources for raw, cleaned and processed RNA-Seq data are sparse for most organisms, and this is especially true for non-human primates (NHPs). Here, we describe a large-scale RNA-Seq data and analysis infrastructure, the NHP reference transcriptome resource (http://nhprtr.org); it presently hosts data from 12 species of primates, to be expanded to 15 species/subspecies spanning great apes, old world monkeys, new world monkeys and prosimians. Data are collected for each species using pools of RNA from comparable tissues. We provide data access in advance of its deposition at NCBI, as well as browsable tracks of alignments against the human genome using the UCSC genome browser. This resource will continue to host additional RNA-Seq data, alignments and assemblies as they are generated over the coming years and provide a key resource for the annotation of NHP genomes as well as informing primate studies on evolution, reproduction, infection, immunity and pharmacology.
Hormone signaling is often pulsatile, and multi-parameter deconvolution procedures have long been utilized to identify and characterize secretory events. However, the existing programs have serious limitations, including the subjective nature of initial peak selection, lack of statistical verification of presumed bursts, and user-unfriendliness of the application. Here, we describe a novel deconvolution program, AutoDecon, which addresses these concerns. We validate AutoDecon for application to serum luteinizing hormone (LH) concentration time series using synthetic data mimicking real data from normal women, and then compare the performance of AutoDecon with that of the widely employed hormone pulsatility analysis program Cluster. The sensitivity of AutoDecon is higher than that of Cluster (~96% vs. ~80%; p = 0.001). However, Cluster had a lower false-positive detection rate than AutoDecon (1% for Cluster vs. 6% for AutoDecon; p = 0.001). Further analysis demonstrated that the pulsatility parameters recovered by AutoDecon were indistinguishable from those characterizing the synthetic data, and that sampling at 5- or 10-minute intervals was optimal for maximizing the sensitivity rates for LH. Accordingly, AutoDecon presents a viable non-subjective alternative to previous pulse detection algorithms for the analysis of LH data. It is applicable to other pulsatile hormone-concentration time series and many other pulsatile phenomena. The software is free and downloadable at
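Because the validation relies on synthetic data with known pulse times, sensitivity and false-positive rate can be scored by matching detected pulses to true ones. The sketch below shows one way to do that. The abstract does not give the exact definitions or matching rule used in the study, so the one-to-one nearest-match rule, the `tolerance` window, the function name `score_detections` and the example times are all assumptions.

```python
def score_detections(true_times, detected_times, tolerance):
    """Score detected pulse times against known (synthetic) pulse times.

    A detection counts as a true positive if an as-yet-unmatched true
    pulse lies within `tolerance` of it; each true pulse can be matched
    at most once. Any other detection is a false positive.

    Returns (sensitivity, false_positive_rate), where
      sensitivity         = true positives / number of true pulses
      false_positive_rate = false positives / number of detections
    """
    unmatched_true = sorted(true_times)
    true_positives = 0
    false_positives = 0
    for detected in sorted(detected_times):
        candidates = [t for t in unmatched_true if abs(t - detected) <= tolerance]
        if candidates:
            nearest = min(candidates, key=lambda t: abs(t - detected))
            unmatched_true.remove(nearest)
            true_positives += 1
        else:
            false_positives += 1
    sensitivity = true_positives / len(true_times) if true_times else 0.0
    fp_rate = false_positives / len(detected_times) if detected_times else 0.0
    return sensitivity, fp_rate


# Made-up pulse times in minutes (not data from the study).
true_pulses = [60, 150, 240, 330, 420]
detected_pulses = [58, 155, 245, 400, 500]

sensitivity, fp_rate = score_detections(true_pulses, detected_pulses, tolerance=10)
print(f"sensitivity = {sensitivity:.2f}, false-positive rate = {fp_rate:.2f}")
```

With these made-up times, three of the five true pulses are recovered and two of the five detections have no true counterpart. Other definitions are possible, for instance expressing false positives per unit time, so results scored this way are not directly comparable to the percentages reported above.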