Background
Analysis of the viral genome for drug resistance mutations is state-of-the-art for guiding treatment selection for human immunodeficiency virus type 1 (HIV-1)-infected patients. These mutations alter the structure of viral target proteins and reduce or, in the worst case, completely inhibit the effect of antiretroviral compounds while maintaining the ability for effective replication. Modern anti-HIV-1 regimens comprise multiple drugs in order to prevent or at least delay the development of resistance mutations. However, commonly used HIV-1 genotype interpretation systems provide classifications only for single drugs. The EuResist initiative has collected data from about 18,500 patients to train three classifiers for predicting response to combination antiretroviral therapy, given the viral genotype and further information. In this work we compare different classifier fusion methods for combining the individual classifiers.

Principal Findings
The individual classifiers yielded similar performance, and all the combination approaches considered performed equally well. The gain in performance due to combining methods did not reach statistical significance compared to the single best individual classifier on the complete training set. However, with smaller training sets (200 to 1,600 instances compared to 2,700) the combination significantly outperformed the individual classifiers (p<0.01; paired one-sided Wilcoxon test). Together with a consistent reduction of the standard deviation compared to the individual prediction engines, this shows a more robust behavior of the combined system. Moreover, using the combined system we were able to identify a class of therapy courses that led to a consistent underestimation (about 0.05 AUC) of the system performance. Discovery of these therapy courses is a further indication of the robustness of the combined system.

Conclusion
The combined EuResist prediction engine is freely available at http://engine.euresist.org.
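To make the idea of classifier fusion concrete, the following is a minimal sketch that averages the scores of three classifiers and compares AUC values. The abstract does not name the fusion methods that were compared, so mean fusion is an assumption chosen for illustration, and the function names `fuse_mean` and `auc` as well as the toy scores are hypothetical.

```python
from typing import List, Sequence


def fuse_mean(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-instance scores of several classifiers.

    Mean fusion is an illustrative assumption; it is not necessarily one of
    the fusion methods evaluated in the study.
    """
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all classifiers must score the same instances")
    return [sum(p[i] for p in predictions) / len(predictions) for i in range(n)]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = therapy success, 0 = therapy failure.
labels = [1, 1, 0, 0]
clf_a = [0.9, 0.4, 0.5, 0.1]
clf_b = [0.6, 0.7, 0.2, 0.65]
clf_c = [0.8, 0.6, 0.4, 0.3]

fused = fuse_mean([clf_a, clf_b, clf_c])
print(auc(clf_a, labels))  # 0.75
print(auc(clf_b, labels))  # 0.75
print(auc(fused, labels))  # 1.0
```

In this toy example the two weaker classifiers err on different instances, so averaging cancels their mistakes. That is only the intuition behind the robustness argument; it does not reproduce the reported results.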
The EpiGRAPH web service (http://epigraph.mpi-inf.mpg.de/) enables biologists to uncover hidden associations in vertebrate genome and epigenome datasets. Users can upload sets of genomic regions and EpiGRAPH will test multiple attributes (including DNA sequence, chromatin structure, epigenetic modifications and evolutionary conservation) for enrichment or depletion among these regions. Furthermore, EpiGRAPH learns to predictively identify similar genomic regions. This paper demonstrates EpiGRAPH's practical utility in a case study on monoallelic gene expression and describes its novel approach to reproducible bioinformatic analysis.

Rationale
EpiGRAPH addresses two tasks that are common in genome biology: discovering novel associations between a set of genomic regions with a specific biological role (for example, experimentally mapped enhancers, hotspots of epigenetic regulation or sites exhibiting disease-specific alterations) and the bulk of genome annotation data that are available from public databases; and assessing whether it is possible to predictively identify additional genomic regions with a similar role without the need for further wet-lab experiments.

The increasing relevance of analyzing sets of genomic regions arises from technical innovations such as tiling microarrays and next-generation sequencing [1][2][3][4][5], which can be used to scan the genome for specific types of regions (for example, transcription factor binding sites or cancer-specific genomic alterations). The resulting datasets are difficult to analyze with existing toolkits for genomic data mining (such as GSEA [6] and DAVID [7]) because most existing tools are gene-centric and cannot easily account for genomic regions that are located outside of (protein-coding) genes.
In the absence of a suitable tool for statistical analysis and prediction of genomic region data, researchers have performed the necessary steps by hand, downloading relevant datasets from existing repositories and writing one-time-use scripts for data integration, statistical analysis and prediction (for example, [8][9][10][11][12][13][14][15][16][17][18][19]). Such manual analyses are time-consuming to perform, difficult to reproduce and require bioinformatic skills that are beyond the reach of most biologists. Hence, these studies support demand for a software toolkit that facilitates statistical analysis and prediction of region-based genome and epigenome data.

With the development of EpiGRAPH, we have pulled together our experiences and established workflows from several studies [10][20][21][22][23] and incorporated them into a powerful and easy-to-use web service. In the remainder of this paper, we sketch the basic concepts of EpiGRAPH, demonstrate its practical use and utility in a case study on monoallelic gene expression, and outline how the UCSC Genome Browser [24],
Next-generation sequencing technologies produce unprecedented amounts of data on the genetic sequence of individual organisms. These sequences carry a substantial amount of variation that may or may not be related to a phenotype. The phenotypically important part of this variation often comes in the form of protein-sequence-altering (non-synonymous) single nucleotide variants (nsSNVs). Here we present StructMAn, a Web-based tool for annotation of human and non-human nsSNVs in the structural context. StructMAn analyzes the spatial location of the amino acid residue corresponding to each nsSNV in the three-dimensional (3D) protein structure relative to other proteins, nucleic acids and low-molecular-weight ligands. We make use of all experimentally available 3D structures of query proteins and also, unlike other tools in the field, of structures of proteins with detectable sequence identity to them. This allows us to provide a structural context for around 20% of all nsSNVs in a typical human sequencing sample, for up to 60% of nsSNVs in genes related to human diseases and for around 35% of nsSNVs in a typical bacterial sample. Each nsSNV can be visualized and inspected by the user in the corresponding 3D structure of a protein or protein complex. The StructMAn server is available at http://structman.mpi-inf.mpg.de.
Finding drug combinations that increase the chances of therapeutic success is the main reason for using decision support systems. The present analysis of a large data set derived from clinical practice demonstrates that g2p-THEO solves this task significantly better than state-of-the-art expert-based systems. The tool is available at http://www.geno2pheno.org.
Identifying resistance to antiretroviral drugs is crucial for ensuring the successful treatment of patients infected with viruses such as human immunodeficiency virus (HIV) or hepatitis C virus (HCV). In contrast to Sanger sequencing, next-generation sequencing (NGS) can detect resistance mutations in minority populations. Thus, genotypic resistance testing based on NGS data can offer novel, treatment-relevant insights. Since existing web services for analyzing resistance in NGS samples are subject to long processing times and follow strictly rules-based approaches, we developed geno2pheno[ngs-freq], a web service for rapidly identifying drug resistance in HIV-1 and HCV samples. By relying on frequency files that provide the read counts of nucleotides or codons along a viral genome, the time-intensive step of processing raw NGS data is eliminated. Once a frequency file has been uploaded, consensus sequences are generated for a set of user-defined prevalence cutoffs, such that the constructed sequences contain only those nucleotides whose codon prevalence exceeds a given cutoff. After locally aligning the sequences to a set of references, resistance is predicted using the well-established approaches of geno2pheno[resistance] and geno2pheno[hcv]. geno2pheno[ngs-freq] can assist clinical decision making by enabling users to explore resistance in viral populations with different abundances and is freely available at http://ngs.geno2pheno.org.
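The consensus-generation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the geno2pheno[ngs-freq] implementation: the input layout (one dictionary of codon read counts per codon position), the function name `consensus`, the use of IUPAC ambiguity codes to merge retained nucleotides, and the `NNN` fallback when no codon exceeds the cutoff are all hypothetical choices.

```python
from typing import Dict, List

# IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus(codon_counts: List[Dict[str, int]], cutoff: float) -> str:
    """Build a consensus sequence from per-position codon read counts.

    At each codon position, only codons whose relative frequency exceeds
    `cutoff` are kept; their nucleotides are merged column by column into
    IUPAC ambiguity codes. Codons are assumed to be uppercase A/C/G/T.
    """
    out = []
    for counts in codon_counts:
        total = sum(counts.values())
        kept = [c for c, n in counts.items() if total and n / total > cutoff]
        if not kept:
            # Assumption: emit a fully ambiguous codon if nothing passes.
            out.append("NNN")
            continue
        for i in range(3):
            out.append(IUPAC[frozenset(c[i] for c in kept)])
    return "".join(out)


# Hypothetical codon counts for three codon positions.
sample = [
    {"ATG": 100},
    {"AAA": 90, "AGA": 10},
    {"GTA": 50, "GTC": 50},
]

print(consensus(sample, 0.15))  # ATGAAAGTM
print(consensus(sample, 0.05))  # ATGARAGTM
```

Lowering the cutoff from 15% to 5% brings the minority codon `AGA` into the consensus, so the second codon becomes the ambiguous `ARA`. This is how minority variants would become visible to a downstream resistance predictor.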