Proksee (https://proksee.ca) provides users with a powerful, easy-to-use, and feature-rich system for assembling, annotating, analysing, and visualizing bacterial genomes. Proksee accepts Illumina sequence reads as compressed FASTQ files or pre-assembled contigs in raw, FASTA, or GenBank format. Alternatively, users can supply a GenBank accession or a previously generated Proksee map in JSON format. Proksee then performs assembly (for raw sequence data), generates a graphical map, and provides an interface for customizing the map and launching further analysis jobs. Notable features of Proksee include unique and informative assembly metrics provided via a custom reference database of assemblies; a deeply integrated high-performance genome browser for viewing and comparing analysis results at individual base resolution (developed specifically for Proksee); an ever-growing list of embedded analysis tools whose results can be seamlessly added to the map or searched and explored in other formats; and the option to export graphical maps, analysis results, and log files for data sharing and research reproducibility. All these features are provided via a carefully designed multi-server cloud-based system that can easily scale to meet user demand and that ensures the web server is robust and responsive.
IntroductionNext‐generation sequencing (NGS) has several advantages over conventional Sanger sequencing for HIV drug resistance (HIVDR) genotyping, including detection and quantitation of low‐abundance variants bearing drug resistance mutations (DRMs). However, the high HIV genomic diversity, unprecedented large volume of data, complexity of analysis and potential for error pose significant challenges for data processing. Several NGS analysis pipelines have been developed and used in HIVDR research; however, the absence of uniformity in data processing strategies results in lack of consistency and comparability of outputs from different pipelines. To fill this gap, an international symposium on bioinformatic strategies for NGS‐based HIVDR testing was held in February 2018 in Winnipeg, Canada, convening laboratory scientists, bioinformaticians and clinicians involved in four recently developed, publicly available NGS HIVDR pipelines. The goal of this symposium was to establish a consensus on effective bioinformatic strategies for NGS data management and its use for HIVDR reporting.DiscussionEssential functionalities of an NGS HIVDR pipeline were divided into five analytic blocks: (1) NGS read quality control (QC)/quality assurance (QA); (2) NGS read alignment and reference mapping; (3) HIV variant calling and variant QC; (4) NGS HIVDR reporting; and (5) extended data applications and additional considerations for data management. The consensuses reached among the participants on all major aspects of these blocks are summarized here. They encompass not only recommended data management and analysis strategies, but also detailed bioinformatic approaches that help ensure accuracy of the derived HIVDR analysis outputs for both research and potential clinical use.ConclusionsWhile NGS is being adopted more broadly in HIVDR testing laboratories, data processing is often a bottleneck hindering its generalized application. The proposed standardization of NGS read QC/QA, read alignment and reference mapping, variant calling and QC, HIVDR reporting and relevant data management strategies in this “Winnipeg Consensus” may serve as a starting guideline for NGS HIVDR data processing that informs the refinement of existing pipelines and those yet to be developed. Moreover, the bioinformatic strategies presented here may apply more broadly to NGS data analysis of microbes harbouring significant genomic diversity.
Next generation sequencing (NGS) is a trending new standard for genotypic HIV-1 drug resistance (HIVDR) testing. Many NGS HIVDR data analysis pipelines have been independently developed, each with variable outputs and data management protocols. Standardization of such analytical methods and comparison of available pipelines are lacking, yet may impact subsequent HIVDR interpretation and other downstream applications. Here we compared the performance of five NGS HIVDR pipelines using proficiency panel samples from NIAID Virology Quality Assurance (VQA) program. Ten VQA panel specimens were genotyped by each of six international laboratories using their own in-house NGS assays. Raw NGS data were then processed using each of the five different pipelines including HyDRA, MiCall, PASeq, Hivmmer and DEEPGEN. All pipelines detected amino acid variants (AAVs) at full range of frequencies (1~100%) and demonstrated good linearity as compared to the reference frequency values. While the sensitivity in detecting low abundance AAVs, with frequencies between 1~20%, is less a concern for all pipelines, their specificity dramatically decreased at AAV frequencies <2%, suggesting that 2% threshold may be a more reliable reporting threshold for ensured specificity in AAV calling and reporting. More variations were observed among the pipelines when low abundance AAVs are concerned, likely due to differences in their NGS read quality control strategies. Findings from this study highlight the need for standardized strategies for NGS HIVDR data analysis, especially for the detection of minority HiVDR variants.Genotypic HIV drug resistance (HIVDR) testing not only guides effective clinical care of HIV-infected patients but also serves to provide surveillance of transmitted HIVDR in the population. Treatment guidelines in resource-permitted settings advocate the use of HIVDR monitoring both prior to ART initiation and when treatment failure is suspected 1,2 . There is increasing evidence showing that the presence of minority resistance variants open Scientific RepoRtS | (2020) 10:1634 | https://doi.org/10.1038/s41598-020-58544-z www.nature.com/scientificreports www.nature.com/scientificreports/ (MRV) in the HIV quasispecies (i.e., a swarm of highly-related but genotypically different viral variants) may be clinically significant and increase the risk of virological failure, impair immune recovery, lead to accumulation of drug resistance, increase risk of treatment switches and death [3][4][5][6][7][8] . A nationwide study in Mexico focusing on pretreatment drug resistance (PDR) found that lowering the detection threshold for PDR to 5% versus the conventional 20% improved the ability to identify patients with virological failure 6 . In addition, a European wide study found that pre-existing minority drug-resistant HIV-1 variants more than doubled the risk of virological failure to first-line NNRTI-based ART 9 . A more recent African study also reported similar findings, suggesting lowering the threshold below 20% improved the ability to i...
Conventional HIV drug resistance (HIVDR) genotyping utilizes Sanger sequencing (SS) methods, which are limited by low data throughput and the inability of detecting low abundant drug resistant variants (LADRVs). Here we present a next generation sequencing (NGS)-based HIVDR typing platform that leverages the advantages of Illumina MiSeq and HyDRA Web. The platform consists of a fully validated sample processing protocol and HyDRA web, an open web portal that allows automated customizable NGS-based HIVDR data processing. This platform was characterized and validated using a panel of HIV-spiked plasma representing all major HIV-1 subtypes, pedigreed plasmids, HIVDR proficiency specimens and clinical specimens. All examined major HIV-1 subtypes were consistently amplified at viral loads of ≥1,000 copies/ml. The gross error rate of this platform was determined at 0.21%, and minor variations were reliably detected down to 0.50% in plasmid mixtures. All HIVDR mutations identifiable by SS were detected by the MiSeq-HyDRA protocol, while LADRVs at frequencies of 1~15% were detected by MiSeq-HyDRA only. As compared to SS approaches, the MiSeq-HyDRA platform has several notable advantages including reduced cost and labour, and increased sensitivity for LADRVs, making it suitable for routine HIVDR monitoring for both patient care and surveillance purposes.
Whole genome sequencing (WGS) is a powerful tool for public health infectious disease investigations owing to its higher resolution, greater efficiency, and cost-effectiveness over traditional genotyping methods. Implementation of WGS in routine public health microbiology laboratories is impeded by a lack of user-friendly automated and semi-automated pipelines, restrictive jurisdictional data sharing policies, and the proliferation of non-interoperable analytical and reporting systems. To address these issues, we developed the Integrated Rapid Infectious Disease Analysis (IRIDA) platform (irida.ca), a user-friendly, decentralized, open-source bioinformatics and analytical web platform to support real-time infectious disease outbreak investigations using WGS data. Instances can be independently installed on local highperformance computing infrastructure, enabling private and secure data management and analyses according to organizational policies and governance. IRIDA's data management capabilities enable secure upload, storage and sharing of all WGS data and metadata. The core platform currently includes pipelines for quality control, assembly, annotation, variant detection, phylogenetic analysis, in silico serotyping, multi-locus sequence typing, and genome distance calculation. Analysis pipeline results can be visualized within the platform through dynamic line lists and integrated phylogenomic clustering for research and discovery, and for enhancing decision-making support and hypothesis generation in epidemiological investigations. Communication and data exchange between instances are provided through customizable access controls. IRIDA complements centralized systems, empowering local analytics and visualizations for genomics-based microbial pathogen investigations. IRIDA is currently transforming the Canadian public health ecosystem and is freely available at https://github.com/phac-nml/irida and www.irida.ca. Impact StatementWhole genome sequencing (WGS) is revolutionizing infectious disease analysis and surveillance due to its cost effectiveness, utility, and improved analytical power. To date, no . CC-BY-NC-ND 4.0 International license It is made available under a was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity.The copyright holder for this preprint (which . http://dx.doi.org/10.1101/381830 doi: bioRxiv preprint first posted online Jul. 31, 2018; 3 "one-size-fits-all" genomics platform has been universally adopted, owing to differences in national (and regional) health information systems, data sharing policies, computational infrastructures, lack of interoperability and prohibitive costs. The Integrated Rapid Infectious Disease Analysis (IRIDA) platform is a user-friendly, decentralized, open-source bioinformatics and analytical web platform developed to support real-time infectious disease outbreak investigations using WGS data. IRIDA empowers public health, regulatory and clinical microbiology laboratory personnel to bett...
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.