Background: When compared to Sanger sequencing technology, next-generation sequencing (NGS) technologies are hindered by shorter sequence read length, higher base-call error rate, non-uniform coverage, and platform-specific sequencing artifacts. These characteristics lower the quality of downstream analyses, e.g. de novo and reference-based assembly, by introducing sequencing artifacts and errors that may contribute to incorrect interpretation of data. Although many tools have been developed for quality control and pre-processing of NGS data, none of them provides flexible and comprehensive trimming options in conjunction with parallel processing to expedite pre-processing of large NGS datasets.
Methods: We developed ngsShoRT (next-generation sequencing Short Reads Trimmer), a flexible and comprehensive open-source software package written in Perl that provides a set of algorithms commonly used for pre-processing NGS short read sequences. We compared the features and performance of ngsShoRT with existing tools: CutAdapt, NGS QC Toolkit and Trimmomatic. We also compared the effects of using pre-processed short read sequences generated by different algorithms on de novo and reference-based assembly for three different genomes: Caenorhabditis elegans, Saccharomyces cerevisiae S288c, and Escherichia coli O157 H7.
Results: Several combinations of ngsShoRT algorithms were tested on publicly available Illumina GA II, HiSeq 2000, and MiSeq eukaryotic and bacterial genomic short read sequences, with the focus on removing sequencing artifacts and low-quality reads and/or bases. Our results show that across three organisms and three sequencing platforms, trimming improved the mean quality scores of trimmed sequences. Using trimmed sequences for de novo and reference-based assembly improved assembly quality as well as assembler performance. In general, ngsShoRT outperformed comparable trimming tools in terms of trimming speed and improvement of de novo and reference-based assembly as measured by assembly contiguity and correctness.
Conclusions: Trimming of short read sequences can improve the quality of de novo and reference-based assembly and assembler performance. The parallel processing capability of ngsShoRT reduces trimming time and improves memory efficiency when dealing with large datasets. We recommend combining sequencing artifact removal with quality-score-based read filtering and base trimming as the most consistent method for improving sequence quality and downstream assemblies.
Availability: ngsShoRT source code, user guide and tutorial are available at http://research.bioinformatics.udel.edu/genomics/ngsShoRT/. ngsShoRT can be incorporated as a pre-processing step in genome and transcriptome assembly projects.
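To make the recommended combination of quality-score-based base trimming and read filtering concrete, here is a minimal Python sketch. It is not ngsShoRT's implementation (ngsShoRT is written in Perl, and its algorithms and options are not described in this abstract). The function names, the thresholds, and the Phred+33 quality encoding are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality string)


def phred33_scores(quality: str) -> List[int]:
    """Decode a quality string, ASSUMING Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality]


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> Tuple[str, str]:
    """Drop bases from the 3' end until a base with quality >= min_q is reached."""
    scores = phred33_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(qual: str, min_mean_q: float = 20.0, min_len: int = 5) -> bool:
    """Keep a read only if it is long enough and its mean quality is high enough."""
    scores = phred33_scores(qual)
    if len(scores) < min_len:
        return False
    return sum(scores) / len(scores) >= min_mean_q


def preprocess(reads: List[Read], min_q: int = 20,
               min_mean_q: float = 20.0, min_len: int = 5) -> List[Read]:
    """Trim each read at the 3' end, then discard reads that fail the filter."""
    kept = []
    for name, seq, qual in reads:
        seq, qual = trim_3prime(seq, qual, min_q)
        if passes_filter(qual, min_mean_q, min_len):
            kept.append((name, seq, qual))
    return kept


# Hypothetical toy reads: 'I' decodes to Q40 and '#' to Q2 under Phred+33.
reads = [
    ("r1", "ACGTACGT", "IIIIII##"),  # two low-quality bases at the 3' end
    ("r2", "ACGTACGT", "########"),  # low quality throughout
]
result = preprocess(reads)
print(result)
```

In this toy input, the first read loses its two low-quality 3' bases and is kept, while the second is trimmed to nothing and discarded. Real datasets from older Illumina pipelines may use a different quality offset, so the decoding step would need to match the data.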
Introduction: Genetic and molecular signatures have been incorporated into cancer prognosis prediction and treatment decisions with good success over the past decade. Clinically, these signatures are usually used in early-stage cancers to evaluate whether they require adjuvant therapy following surgical resection. A molecular signature that is prognostic across more clinical contexts would be a useful addition to current signatures.
Methods: We defined a signature for the ubiquitous tissue factor, E2F4, based on its shared target genes in multiple tissues. These target genes were identified by chromatin immunoprecipitation sequencing (ChIP-seq) experiments using a probabilistic method. We then computationally calculated the regulatory activity score (RAS) of E2F4 in cancer tissues, and examined how E2F4 RAS correlates with patient survival.
Results: Genes in our E2F4 signature were 21-fold more likely to be correlated with breast cancer patient survival time compared to randomly selected genes. Using eight independent breast cancer datasets containing over 1,900 unique samples, we stratified patients into low and high E2F4 RAS groups. E2F4 activity stratification was highly predictive of patient outcome, and our results remained robust even when controlling for many factors including patient age, tumor size, grade, estrogen receptor (ER) status, lymph node (LN) status, whether the patient received adjuvant therapy, and the patient’s other prognostic indices such as Adjuvant! and the Nottingham Prognostic Index scores. Furthermore, the fractions of samples with positive E2F4 RAS vary in different intrinsic breast cancer subtypes, consistent with the different survival profiles of these subtypes.
Conclusions: We defined a prognostic signature, the E2F4 regulatory activity score, and showed it to be significantly predictive of patient outcome in breast cancer regardless of treatment status and the states of many other clinicopathological variables. It can be used in conjunction with other breast cancer classification methods such as Oncotype DX to improve clinical outcome prediction.
Electronic supplementary material: The online version of this article (doi:10.1186/s13058-014-0486-7) contains supplementary material, which is available to authorized users.
Nephrolithiasis is strongly associated with prior pregnancies. Among women of reproductive age, the odds of stones are more than doubled in those who had been pregnant compared with those who had never been pregnant. Nephrolithiasis prevalence also increases with the number of pregnancies. Future investigation and identification of modifiable risk factors among pregnant patients may allow for a reduction in the burden of stone disease in women.
Among adults of working and child-rearing ages in the United States, the much-touted gender disparity in nephrolithiasis is not present. Prior assessments of gender-based stone prevalence may have failed to specifically assess this economically critical demographic, or there may in fact be an ongoing epidemiological change. Recognition that women are as likely as men to form stones in this cohort suggests the need to better elucidate the pathophysiology of stones in women.