A crucial step in the molecular detection of viruses in clinical specimens is the efficient extraction of viral nucleic acids. The total yield of viral nucleic acid from a clinical specimen depends on the specimen's volume, the initial virus concentration and the efficiency of the extraction method. Recent Next Generation Sequencing (NGS)-based diagnostic approaches (i.e., metagenomics) provide a molecular 'open view' into the sample, as they theoretically generate sequence reads of any nucleic acid present in a specimen in a statistically representative manner. However, since a higher virus-related read output promises better sensitivity in the subsequent bioinformatic analysis, the extraction method selected determines the reliability of diagnostic NGS. In this study, nine commercially available kits for nucleic acid extraction were compared by real-time PCR regarding the simultaneous isolation of DNA and RNA; four of these were selected for subsequent comparison by NGS (QIAamp Viral RNA Mini Kit, QIAamp DNA Blood Mini Kit, QIAamp cador Pathogen Mini Kit and QIAamp MinElute Virus Spin Kit). The nucleic acid yields and the sequence read output were compared for four different model viruses comprising Reovirus, Orthomyxovirus, Orthopoxvirus and Paramyxovirus, each at defined but varying concentrations in the same sample. The total amount of nucleic acid was processed to sequence the RNA (as cDNA) and the DNA, with quantification by Qubit and virus-specific quantitative real-time PCRs. NGS libraries were prepared for sequencing on the Illumina HiSeq 1500 system. Finally, the percentage of reads assignable to each virus was determined via mapping. Evaluation of different commercial nucleic acid extraction kits with four different viruses indicates little variation in the read numbers obtained for transcribed RNA or DNA by NGS.
Since NGS is increasingly being used as a tool in the diagnostics of infectious diseases, the individual steps of the complete process have to be validated carefully. Here we show that, for virus identification in liquid clinical specimens, any nucleic acid extraction kit that performs well for PCR diagnostics can also be used for NGS diagnostics, and that the selection of the kit has only a minor impact on the yield of viral reads.
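The final measurement in the abstract above, the percentage of reads assignable to each virus after mapping, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name `percent_reads_per_virus`, the input layout (one reference label per mapped read) and the toy numbers are assumptions made purely for illustration.

```python
from collections import Counter


def percent_reads_per_virus(assignments, total_reads):
    """Return the percentage of all reads assigned to each reference.

    assignments: iterable with one reference label per mapped read
                 (hypothetical input layout, not taken from the study).
    total_reads: total number of reads in the library, mapped or not.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    counts = Counter(assignments)
    return {virus: 100.0 * n / total_reads for virus, n in counts.items()}


# Toy data, invented for illustration only: 1000 reads in total,
# 5 map to a reovirus reference and 15 to an orthopoxvirus reference.
mapped = ["Reovirus"] * 5 + ["Orthopoxvirus"] * 15
result = percent_reads_per_virus(mapped, total_reads=1000)
print(result)
```

Dividing by the total read count rather than by the number of mapped reads keeps the values comparable between extraction kits, because host and background reads stay in the denominator.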
Motivation: Next generation sequencing (NGS) has provided researchers with a powerful tool to characterize metagenomic and clinical samples in research and diagnostic settings. NGS allows an open view into samples, which is useful for detecting pathogens in an unbiased fashion and without a prior hypothesis about possible causative agents. However, NGS datasets for pathogen detection come with several obstacles, such as a very unfavorable ratio of pathogen to host reads. Besides frequent false positives and irrelevant organisms such as contaminants, tools are often challenged by samples with low pathogen loads and might not report organisms present below a certain threshold. Furthermore, some metagenomic profiling tools focus on only one particular set of pathogens, for example bacteria.

Results: We present PAIPline, a bioinformatics pipeline specifically designed to address problems associated with detecting pathogens in diagnostic samples. PAIPline particularly focuses on user-friendliness and encapsulates all necessary steps, from preprocessing to resolution of ambiguous reads and filtering up to visualization, in a single tool. In contrast to existing tools, PAIPline is more specific while maintaining sensitivity. This is shown in a comparative evaluation in which PAIPline was benchmarked alongside other well-known metagenomic profiling tools on previously published, well-characterized datasets. Additionally, as part of an international cooperation project, PAIPline was applied to an outbreak sample of hemorrhagic fevers of then unknown etiology. The presented results show that PAIPline can serve as a robust, reliable, user-friendly, adaptable and generalizable stand-alone software for diagnostics from NGS samples and as a stepping stone for further downstream analyses.

Availability and implementation: PAIPline is freely available at https://gitlab.com/rki_bioinformatics/paipline.
Over the past years, NGS has been applied in time-critical applications such as pathogen diagnostics with promising results. Yet, long turnaround times have to be accepted to generate sufficient data, as the analysis can only be performed sequentially after the sequencing has finished. Additionally, the interpretation of results can be further complicated by various types of contamination, clinically irrelevant sequences, and the sheer amount and complexity of the data. We designed and implemented PathoLive, a real-time diagnostics pipeline which allows the detection of pathogens from clinical samples up to several days before the sequencing procedure has finished and currently available tools can start to run. We adapted the core algorithm of HiLive, a real-time read mapper, and enhanced its accuracy for our use case. Furthermore, common contaminants, low-entropy areas, and sequences of widespread, nonpathogenic organisms are automatically marked beforehand using NGS datasets from healthy humans as a baseline. The results are visualized in an interactive taxonomic tree that provides an intuitive overview and detailed measures regarding the relevance of each identified potential pathogen. We applied the pipeline to a human plasma sample that was spiked in vitro with vaccinia virus, yellow fever virus, mumps virus, Rift Valley fever virus, adenovirus, and mammalian orthoreovirus. The sample was then sequenced on an Illumina HiSeq. All spiked agents were detected after the completion of only 12% of the sequencing procedure and were ranked more accurately throughout the run than by any of the tested tools on the complete data. We also found a large number of other sequences, and these were correctly marked as clinically irrelevant in the resulting visualization. This tagging allows the user to assess the situation correctly at first glance.
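The abstract above mentions that low-entropy areas are marked automatically but does not say how. One common way to flag such regions is Shannon entropy over the nucleotide composition of a sliding window. The sketch below shows that general idea only. It is not PathoLive's implementation, and the function names, the window size and the threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (bits per base) of the base composition of seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq.upper())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_entropy_windows(seq, window=20, threshold=1.0):
    """Return start positions of windows whose entropy is below threshold.

    window and threshold are illustrative defaults, not values
    taken from the PathoLive publication.
    """
    return [
        i
        for i in range(len(seq) - window + 1)
        if shannon_entropy(seq[i:i + window]) < threshold
    ]


# Toy sequence: a homopolymer run followed by a mixed-composition stretch.
toy = "A" * 20 + "ACGT" * 5
flagged = low_entropy_windows(toy)
print(flagged)
```

A homopolymer window scores 0 bits and a window with equal base frequencies scores 2 bits, so a threshold between these values separates repetitive stretches from informative sequence in this toy case.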