Correct annotation metadata is critical for reproducible and accurate RNA-seq analysis. When files are shared publicly or among collaborators with incorrect or missing annotation metadata, it becomes difficult or impossible to reproduce bioinformatic analyses from raw data. It also makes it more difficult to locate the transcriptomic features, such as transcripts or genes, in their proper genomic context, which is necessary for overlapping expression data with other datasets. We provide a solution in the form of an R/Bioconductor package tximeta that performs numerous annotation and metadata gathering tasks automatically on behalf of users during the import of transcript quantification files. The correct reference transcriptome is identified via a hashed checksum stored in the quantification output, and key transcript databases are downloaded and cached locally. The computational paradigm of automatically adding annotation metadata based on reference sequence checksums can greatly facilitate genomic workflows, by helping to reduce overhead during bioinformatic analyses, preventing costly bioinformatic mistakes, and promoting computational reproducibility. The tximeta package is available at https://bioconductor.org/packages/tximeta.
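The checksum-based identification described in this abstract can be illustrated with a minimal Python sketch. It is not the tximeta implementation (tximeta is an R/Bioconductor package). The function names, the use of SHA-256, the toy sequences, and the in-memory `registry` are all assumptions made for illustration. The idea shown is only that a digest of the reference sequences can serve as a key for looking up annotation metadata.

```python
import hashlib


def sequence_digest(sequences):
    """Return a SHA-256 hex digest over transcript sequences, in order.

    Hypothetical helper: the real quantification tools define their own
    digest scheme; this only illustrates the general idea.
    """
    h = hashlib.sha256()
    for seq in sequences:
        h.update(seq.upper().encode("ascii"))
        h.update(b"\n")  # separator so sequence boundaries affect the digest
    return h.hexdigest()


def identify_reference(digest, registry):
    """Look up annotation metadata for a digest; return None if unknown."""
    return registry.get(digest)


# Toy transcriptome and registry (made-up data, not a real annotation source).
toy_transcripts = ["ACGTACGT", "GGGTTTCC"]
toy_digest = sequence_digest(toy_transcripts)
registry = {
    toy_digest: {"source": "ExampleDB", "release": "1", "organism": "toy"},
}

# A known digest resolves to its metadata; an unknown one resolves to None.
metadata = identify_reference(toy_digest, registry)
unknown = identify_reference(sequence_digest(["ACGT"]), registry)
```

Because the key is derived from the sequences themselves, a quantification file that carries the digest can be matched to its reference without relying on file names or user-supplied labels.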
Nonalcoholic fatty liver disease (NAFLD) is rapidly becoming the most common cause of chronic liver disease due to an increase in the prevalence of obesity. Progression to nonalcoholic steatohepatitis (NASH) leads to an increase in morbidity and mortality. While the first line of treatment is lifestyle modification, including dietary changes and increased physical activity, there are currently no approved pharmacological agents for NAFLD or NASH. Because the pathophysiology is complex, several pathways are under investigation for drug development, with a focus on metabolic pathways, inflammation, and slowing or reversing fibrosis. Several agents are advancing in clinical trials, and promising results have been seen with drugs that affect hepatic steatosis, inflammation, and fibrosis. This review provides an overview of NAFLD and some of the disease mechanisms that are being targeted with pharmacologic agents.
Recent studies have revealed repeated patterns of genomic divergence associated with species formation. Such patterns suggest that natural selection tends to target a set of available genes, but they also indicate that closely related taxa share evolutionary constraints that limit genetic variability. Studying patterns of genomic divergence among populations within the same species may shed light on the underlying evolutionary processes. Here, we examine transcriptome-wide divergence and polymorphism in the marine copepod Tigriopus californicus, a species where allopatric evolution has led to replicate sets of populations with varying degrees of divergence and hybrid incompatibility. Our analyses suggest that relatively small effective population sizes have resulted in an exponential decline of shared polymorphisms during population divergence and have also facilitated the fixation of slightly deleterious mutations within allopatric populations. Five interpopulation comparisons at three different stages of divergence show that nonsynonymous mutations tend to accumulate in a specific set of proteins. These include proteins with central roles in cellular metabolism, such as those encoded in mtDNA, as well as an additional set of proteins that repeatedly show signatures of positive selection during allopatric divergence. Although our results are consistent with nonadaptive factors, such as genetic drift and gene expression levels, contributing to repeatable patterns of genomic divergence in closely related taxa, they also indicate that adaptive evolution targeting a specific set of genes contributes to this pattern. Our results yield insights into the predictability of evolution at the gene level.
The sourmash software package uses MinHash-based sketching to create "signatures", compressed representations of DNA, RNA, and protein sequences, that can be stored, searched, explored, and taxonomically annotated. sourmash signatures can be used to estimate sequence similarity between very large data sets quickly and in low memory, and can be used to search large databases of genomes for matches to query genomes and metagenomes. sourmash is implemented in C++, Rust, and Python, and is freely available under the BSD license at http://github.com/dib-lab/sourmash.
Keywords: bioinformatics, sequence analysis, MinHash, k-mer, sourmash
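The MinHash sketching mentioned in the sourmash abstract can be illustrated with a small pure-Python sketch. This is not the sourmash API: the function names, the default `k` and `num_hashes` values, and the use of SHA-1 from the standard library as the hash function are assumptions chosen for illustration, and reverse-complement handling is omitted for brevity. The technique shown is a bottom-k MinHash: keep only the smallest hash values of a sequence's k-mers, then estimate Jaccard similarity from two such sketches.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-length substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (SHA-1 is a stand-in hash here)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(seq, k=5, num_hashes=50):
    """Bottom-k sketch: the num_hashes smallest k-mer hash values."""
    hashes = sorted({hash_kmer(km) for km in kmers(seq, k)})
    return set(hashes[:num_hashes])


def jaccard_estimate(sketch_a, sketch_b, num_hashes=50):
    """Estimate Jaccard similarity from two bottom-k sketches.

    Take the num_hashes smallest values of the merged sketches and
    count the fraction that appear in both.
    """
    merged = sorted(sketch_a | sketch_b)[:num_hashes]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in sketch_a and h in sketch_b)
    return shared / len(merged)


# Toy sequences (made up for illustration).
seq_a = "ACGTACGTTAGCCGATAGGCTAACGTTAGC"
seq_b = "ACGTACGTTAGCCGATAGGCTTTTTTTTTT"

same = jaccard_estimate(minhash_sketch(seq_a), minhash_sketch(seq_a))
disjoint = jaccard_estimate(minhash_sketch("AAAAAAAAAA"), minhash_sketch("CCCCCCCCCC"))
partial = jaccard_estimate(minhash_sketch(seq_a), minhash_sketch(seq_b))
```

The sketch size stays fixed no matter how long the input is, which is what makes comparisons between very large data sets cheap in time and memory.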