Deciphering gene–disease association is a crucial step in designing therapeutic strategies against diseases. There are experimental methods for identifying gene–disease associations, such as genome-wide association studies and linkage analysis, but these can be expensive and time consuming. As a result, various in silico methods for predicting associations from these and other data have been developed using different approaches. In this article, we review some of the recent approaches to the computational prediction of gene–disease association. We look at recent advancements in algorithms, categorising them into those based on genome variation, networks, text mining, and crowdsourcing. We also look at some of the challenges faced in the computational prediction of gene–disease associations.
Sickle cell disease (SCD) is a debilitating single-gene disorder caused by a single point mutation that results in physical deformation (i.e. sickling) of erythrocytes at reduced oxygen tensions. Up to 75% of SCD in newborns worldwide occurs in sub-Saharan Africa, where neonatal and childhood mortality from sickle cell related complications is high. While SCD research across the globe is tackling the disease on multiple fronts, advances have yet to significantly improve the health and quality of life of SCD patients, owing to a lack of coordination between these disparate efforts. Ensuring that data across studies are directly comparable through standardization is a necessary step towards realizing this goal. Such standardization requires the development and implementation of a disease-specific ontology for SCD that is applicable globally. Ontology development is best achieved by bringing together experts in the domain to contribute their knowledge. The SCD community and H3ABioNet members joined forces at a recent SCD Ontology workshop to develop an ontology covering aspects of SCD under the classes: phenotype, diagnostics, therapeutics, quality of life, disease modifiers and disease stage. The aim of the workshop was for participants to contribute their expertise to the development of the structure and contents of the SCD ontology. Here we describe the proceedings of the Sickle Cell Disease Ontology Workshop held in Cape Town, South Africa, in February 2016 and its outcomes. The objective of the workshop was to bring together experts in SCD from around the world to contribute their expertise to the development of various aspects of the SCD ontology.
Following the central dogma of molecular biology, where data flows from gene to protein through transcript, information on gene expression provides information on the functional state of an organism. Microarray technology arose to measure the expression level of thousands of genes simultaneously. These vast amounts of data generated at all levels of biological organization help to identify co-expressed genes, which may reveal proteins interacting in a complex or acting in the same pathway without direct physical contact. Discovering associations of regulatory patterns of characterized proteins with those of hypothetical proteins may identify functional relationships between them and facilitate the characterization of proteins of unknown function. Here we make use of the random partial least squares regression technique (r-PLS) to trace connections between co-expressed genes in Mycobacterium tuberculosis using data downloaded from public microarray databases. We generated the overall topology of a microbial co-expression network with the exact complexity of the model. This approach provides a general method for generating a co-expression network of an organism for the purpose of systems-level analyses.
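As a rough illustration of what a co-expression network is, the sketch below links two genes whenever their expression profiles are strongly correlated. It deliberately uses plain Pearson correlation with a fixed cut-off rather than the r-PLS technique named in the abstract, and the gene names, expression values and the `threshold` parameter are hypothetical toy inputs, not data from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold.

    Simplified stand-in for r-PLS: pairwise correlation only.
    """
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical toy data: each gene measured under four conditions.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 1.0, 3.0, 2.0],
}
edges = coexpression_edges(expression, threshold=0.9)
```

With these toy profiles only `gene_a` and `gene_b` are connected, since their values rise together across all four conditions.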
Advances in high-throughput sequencing technologies have resulted in an exponential growth of publicly accessible biological datasets. In the ‘big data’ driven ‘post-genomic’ context, much work is being done to explore human protein–protein interactions (PPIs) through systems-level analysis, in order to uncover useful signals, advance current knowledge and answer specific biological and health questions. These PPIs are determined experimentally or predicted computationally and stored in different online databases, some of which are updated regularly. As with many biological datasets, such regular updates continuously render older PPI datasets potentially outdated. Moreover, while many of these interactions are shared between these online resources, each resource includes its own identified PPIs and none of these databases contains the complete set of known human PPIs. In this context, it is essential to enable the integration of interaction datasets from different resources, to generate a PPI map with increased coverage and confidence. To allow researchers to produce integrated human PPI datasets in real time, we introduce the integrated human protein–protein interaction network generator (IHP-PING) tool. IHP-PING is a flexible Python package which generates a human PPI network from freely available online resources. This tool extracts and integrates heterogeneous PPI datasets to generate a unified PPI network, which is stored locally for further applications.
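The general idea of integrating interaction lists from several resources can be sketched in a few lines of Python. This is not the IHP-PING API, which the abstract does not describe; the function names, the source labels `db_one` and `db_two`, and the toy interaction pairs are all assumptions made for illustration. The sketch treats interactions as undirected, merges duplicates across sources, and records which sources support each pair so that support can serve as a crude confidence measure.

```python
from collections import defaultdict


def integrate_ppi(sources):
    """Merge undirected PPI edge lists from several sources.

    sources: dict mapping a source name to an iterable of (protein_a, protein_b).
    Returns a dict mapping each sorted protein pair to the set of supporting sources.
    """
    merged = defaultdict(set)
    for name, edges in sources.items():
        for a, b in edges:
            if a == b:
                continue  # skip self-interactions in this simplified sketch
            pair = tuple(sorted((a, b)))  # (A, B) and (B, A) are the same edge
            merged[pair].add(name)
    return dict(merged)


def filter_by_support(merged, min_sources=2):
    """Keep only interactions reported by at least min_sources sources."""
    return {pair: srcs for pair, srcs in merged.items() if len(srcs) >= min_sources}


# Hypothetical toy input: two sources with overlapping interaction lists.
sources = {
    "db_one": [("P04637", "Q00987"), ("P38398", "P51587")],
    "db_two": [("Q00987", "P04637"), ("P04637", "P04637")],
}
network = integrate_ppi(sources)
high_confidence = filter_by_support(network, min_sources=2)
```

Taking the union of all sources increases coverage, while filtering on the number of supporting sources is one simple way to raise confidence; a real integration would also need to map each resource's identifiers to a common namespace first.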