Sequencing technologies that use nucleotide conversion, such as cytosine-to-thymine conversion in bisulfite-seq and thymine-to-cytosine conversion in SLAM-seq, are powerful tools for exploring the chemical intricacies of cellular processes. To date, no unified methodology has been developed for aligning converted sequences and consolidating alignment for these technologies in one package. In this paper, we describe hierarchical indexing for spliced alignment of transcripts–3 nucleotides (HISAT-3N), which can rapidly and accurately align sequences with any nucleotide conversion by leveraging the hierarchical index and repeat index algorithms originally developed for the HISAT software. Tests on real and simulated data sets show that HISAT-3N is faster than other modern systems, with greater alignment accuracy, higher scalability, and smaller memory requirements. HISAT-3N is therefore well suited to aligning data from converted-sequence technologies.
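The idea behind aligning with a reduced, three-letter alphabet can be illustrated with a minimal sketch. This is not HISAT-3N's code or API. The function names are hypothetical, and the sketch ignores strand handling, mismatches, and indexing. It only shows that collapsing the converted base in both read and reference makes a converted read comparable to its origin, after which the conversion sites can be recovered from the original sequences.

```python
# Illustrative sketch only: hypothetical helpers, not part of HISAT-3N.

def convert(seq: str, src: str = "C", dst: str = "T") -> str:
    """Collapse every `src` base into `dst`, giving a 3-letter sequence."""
    return seq.upper().replace(src, dst)


def matches_after_conversion(read: str, ref: str,
                             src: str = "C", dst: str = "T") -> bool:
    """True if read and reference are identical once both are converted."""
    return convert(read, src, dst) == convert(ref, src, dst)


def conversion_sites(read: str, ref: str,
                     src: str = "C", dst: str = "T") -> list[int]:
    """Positions where the reference has `src` and the read shows `dst`."""
    return [i for i, (r, g) in enumerate(zip(read.upper(), ref.upper()))
            if g == src and r == dst]


if __name__ == "__main__":
    reference = "ACGTCCGA"
    # Bisulfite-style read: two C bases were read as T, one C was retained.
    read = "ATGTCTGA"
    print(matches_after_conversion(read, reference))  # exact match in 3-letter space
    print(conversion_sites(read, reference))          # converted positions
```

Passing `src="T", dst="C"` models the thymine-to-cytosine conversion of SLAM-seq in the same way.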
We have established a protocol that allows for parallel quantification of three carbohydrate pools in the marine diatom Phaeodactylum tricornutum. This method uses a series of extraction and digestion steps followed by the 3-methyl-2-benzothiazolinone hydrazone (MBTH) reducing sugar assay. Comparing carbohydrate content between hydrolyzed and nonhydrolyzed soluble extracts enables quantification of soluble, nonreducing carbohydrate. The latter fraction contains chrysolaminarin, as verified by ¹H-NMR and monomer composition of the purified glucan. We applied this method to investigate carbon partitioning in two experiments. We observed the accumulation of chrysolaminarin during the day and its near-complete consumption in the dark, supporting its role in fueling heterotrophic metabolism at night. We then observed little change in chrysolaminarin accumulation or consumption during nitrogen starvation, a condition that is known to increase the cellular content of the biofuel precursor triacylglycerol. Overall, this method improves the resolution of major carbohydrate pools in
Motivation: With the vast improvements in sequencing technologies and the increased number of protocols, sequencing is being used to answer complex biological problems. Consequently, analysis pipelines have become more time-consuming and complicated, often requiring extensive pre-validation steps. Here we present SeqWho, a program designed to heuristically assess the quality of sequencing files and reliably classify the organism and protocol type by using Random Forest classifiers trained on biases inherent in k-mer frequencies and repeat sequence identities. Results: Using one of our primary models, we show that our method accurately and rapidly classifies human and mouse sequences from nine different sequencing libraries by species, library, and both together, 98.32%, 97.86%, and 96.38% of the time, respectively. Ultimately, we demonstrate that SeqWho is a powerful method for reliably validating the quality and identity of the sequencing files used in any pipeline. Availability: https://github.com/DaehwanKimLab/seqwho Supplementary information: Supplementary data are available at Bioinformatics online.
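As a rough illustration of the kind of feature the abstract mentions, the sketch below computes a normalized k-mer frequency vector from a handful of reads. It is not SeqWho's implementation. The function name, the default `k=2`, and the choice to skip k-mers containing ambiguous bases are all assumptions made for the example, and the Random Forest classification step is omitted entirely.

```python
# Illustrative sketch only: hypothetical helper, not part of SeqWho.
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(reads, k=2):
    """Return k-mer frequencies in a fixed order (AA, AC, AG, ...).

    K-mers containing anything other than A, C, G, or T are skipped
    (an assumption of this sketch). The vector sums to 1 when at
    least one valid k-mer is seen, otherwise it is all zeros.
    """
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set(ALPHABET):
                counts[kmer] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]


if __name__ == "__main__":
    features = kmer_frequencies(["ACGT", "AANA"], k=2)
    print(len(features))  # one entry per possible 2-mer
    print(features)
```

A vector like this, computed per file, is the sort of input a classifier could be trained on. Which k values and additional features SeqWho actually uses is not stated in the abstract.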