MicroRNAs (miRNAs) are short regulatory RNAs processed from partially self-complementary foldbacks within longer MIRNA primary transcripts. Several MIRNA families are conserved deeply through land plants, but many are present only in closely related species or are species specific. The finding of numerous evolutionarily young MIRNA, many with low expression and few if any targets, supports a rapid birth-death model for MIRNA evolution. A systematic analysis of MIRNA genes and families in the close relatives, Arabidopsis thaliana and Arabidopsis lyrata, was conducted using both whole-genome comparisons and high-throughput sequencing of small RNAs. Orthologs of 143 A. thaliana MIRNA genes were identified in A. lyrata, with nine having significant sequence or processing changes that likely alter function. In addition, at least 13% of MIRNA genes in each species are unique, despite their relatively recent speciation (~10 million years ago). Alignment of MIRNA foldbacks to the Arabidopsis genomes revealed evidence for recent origins of 32 families by inverted or direct duplication of mostly protein-coding gene sequences, but less than half of these yield miRNA that are predicted to target transcripts from the originating gene family. miRNA nucleotide divergence between A. lyrata and A. thaliana orthologs was higher for young MIRNA genes, consistent with reduced purifying selection compared with deeply conserved MIRNA genes. Additionally, target sites of younger miRNA were lost more frequently than for deeply conserved families. In summary, our systematic analyses emphasize the dynamic nature of the MIRNA complement of plant genomes.
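The divergence comparison between orthologs described above can be illustrated with a minimal sketch. The abstract does not state which divergence measure was used, so this assumes the simplest one: the fraction of mismatched positions between two pre-aligned sequences, skipping gap columns. The function name `nucleotide_divergence` and the example sequences are hypothetical and not taken from the study.

```python
def nucleotide_divergence(seq_a, seq_b, gap="-"):
    """Fraction of mismatched positions between two pre-aligned sequences.

    Assumption: columns containing a gap in either sequence are skipped.
    This is an illustrative measure, not necessarily the one used in the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return mismatches / compared


# Made-up 10-nt sequences differing at one position.
example = nucleotide_divergence("ACGUACGUAC", "ACGUACGAAC")  # 1 mismatch / 10 = 0.1
```

Under this measure, a higher value for a young MIRNA ortholog pair than for a deeply conserved pair would be in line with the reduced purifying selection the abstract describes.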
The advent of high-throughput sequencing (HTS) methods has enabled direct approaches to quantitatively profile small RNA populations. However, these methods have been limited by several factors, including representational artifacts and lack of established statistical methods of analysis. Furthermore, massive HTS data sets present new problems related to data processing and mapping to a reference genome. Here, we show that cluster-based sequencing-by-synthesis technology is highly reproducible as a quantitative profiling tool for several classes of small RNA from Arabidopsis thaliana. We introduce the use of synthetic RNA oligoribonucleotide standards to facilitate objective normalization between HTS data sets, and adapt microarray-type methods for statistical analysis of multiple samples. These methods were tested successfully using mutants with small RNA biogenesis (miRNA-defective dcl1 mutant and siRNA-defective dcl2 dcl3 dcl4 triple mutant) or effector protein (ago1 mutant) deficiencies. Computational methods were also developed to rapidly and accurately parse, quantify, and map small RNA data.
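The idea of normalizing between data sets with synthetic standards can be sketched as follows. The abstract does not give the actual procedure, so this assumes one simple scheme: each library is rescaled so that the total reads from the spiked-in standards match a common reference value. The function names, the `reference_total` parameter, and all counts are hypothetical.

```python
def spike_in_scale_factor(standard_counts, reference_total):
    """Scale factor that brings a library's spike-in total to reference_total."""
    total = sum(standard_counts.values())
    if total <= 0:
        raise ValueError("library contains no reads from the standards")
    return reference_total / total


def normalize_library(counts, standard_counts, reference_total=1000.0):
    """Rescale raw small RNA read counts using the library's spike-in reads.

    Assumption: the same amount of each standard was added to every library,
    so differences in spike-in reads reflect library-level effects only.
    """
    factor = spike_in_scale_factor(standard_counts, reference_total)
    return {name: n * factor for name, n in counts.items()}


# Made-up counts: library B was sequenced twice as deeply as library A.
lib_a = normalize_library({"sRNA_1": 200}, {"std1": 500, "std2": 500})
lib_b = normalize_library({"sRNA_1": 400}, {"std1": 1000, "std2": 1000})
```

After rescaling, the two hypothetical libraries report the same abundance for `sRNA_1`, which is the behavior a spike-in normalization is meant to provide when the underlying samples do not differ.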
In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced.
Wastewater-based epidemiology uses pooled wastewater samples to monitor community health and has been used extensively during the COVID-19 pandemic to track SARS-CoV-2 RNA shed by infected individuals into wastewater. Wastewater concentrations of SARS-CoV-2 RNA have been positively correlated with contemporaneous counts of COVID-19 cases, making them useful for following relative disease burden trends within a community. However, the statistical associations are too weak for wastewater-based epidemiology to reliably predict reported case counts, limiting its potential. Here we show that wastewater SARS-CoV-2 concentrations are highly correlated with the community prevalence estimated from 8 randomized household community surveys in 6 Oregon communities over a 10-month period. We found that wastewater-based epidemiology is a significantly better predictor of COVID-19 community prevalence than reported case counts, which suffer from systematic biases including variations in access to testing and underreporting of asymptomatic cases, even after accounting for uncertainty inherent in the wastewater and prevalence estimates by using Monte Carlo simulations. Additionally, our results show that wastewater-based epidemiology can identify the rise and fall of neighborhood-scale COVID-19 hot spots and provide rapid information about the presence of SARS-CoV-2 variants at the neighborhood- and city-scale through sequence analyses of the wastewater. These results validate the potential of wastewater-based epidemiology to be a quantitative method to predict the prevalence of SARS-CoV-2 and identify the presence of variants of concern in a given community or neighborhood, independent of availability and access to individual-level testing.
These advantages, in combination with its scalability, relatively modest cost, and low labor requirements, make integrating permanent wastewater-based epidemiology infrastructure into public health systems a key component in creating pandemic-resilient cities in the future.
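One way to account for uncertainty in both the wastewater and prevalence estimates with Monte Carlo simulation is sketched below. The abstract does not describe the actual procedure, so this assumes independent Gaussian errors on each estimate and summarizes the resulting distribution of Pearson correlations by its median and an approximate 95% percentile interval. The function names, parameters, and all numbers are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def monte_carlo_correlation(x, x_se, y, y_se, n_draws=2000, seed=0):
    """Propagate measurement uncertainty into a correlation estimate.

    Assumption: each estimate has an independent Gaussian error with the
    given standard error. Returns (lower, median, upper), where lower and
    upper approximate the 2.5th and 97.5th percentiles of the draws.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        xs = [rng.gauss(m, s) for m, s in zip(x, x_se)]
        ys = [rng.gauss(m, s) for m, s in zip(y, y_se)]
        draws.append(pearson(xs, ys))
    draws.sort()
    lower = draws[int(0.025 * n_draws)]
    median = draws[n_draws // 2]
    upper = draws[int(0.975 * n_draws) - 1]
    return lower, median, upper


# Made-up values for illustration only (not data from the study).
wastewater = [1.0, 2.0, 3.0, 4.0]      # hypothetical RNA concentrations
prevalence = [2.0, 4.0, 6.0, 8.0]      # hypothetical prevalence estimates
interval = monte_carlo_correlation(wastewater, [0.5] * 4, prevalence, [0.5] * 4)
```

Running the same routine on reported case counts in place of wastewater concentrations would give a second interval, and comparing the two is one simple way to ask which predictor tracks prevalence more closely once measurement error is included.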
Metagenomic library preparation methods and sequencing technologies continue to advance rapidly, allowing researchers to characterize microbial communities in previously underexplored environmental samples and systems. However, widely accepted standardized library preparation methods can be cost-prohibitive.