Abstract: Susceptibility to complex diseases is characterised by numerous genetic, lifestyle, and environmental causes, acting individually or through their interaction effects. The recent explosion in detecting interacting genetic factors is increasingly revealing the underlying biological networks behind complex diseases. Several computational methods have been explored to discover interacting polymorphisms among unlinked loci. However, there has been no significant breakthrough towards solving this problem because of biomolecular complexities and computational limitations. Our previous research trained a deep multilayered feedforward neural network to predict two-locus polymorphisms due to interactions in genome-wide data. The performance of the method was studied on numerous simulated datasets and a published genome-wide dataset. In this manuscript, the performance of the trained multilayer neural network is validated by varying the parameters of the models under various scenarios. Furthermore, the observations of the previous method are confirmed in this study by evaluating on a real dataset. The experimental findings on a real dataset show a significant rise in prediction accuracy over other conventional techniques. The result shows highly ranked interacting two-locus polymorphisms, which may be associated with susceptibility to the development of breast cancer.
In this era of genome-wide association studies (GWAS), the quest to understand the genetic architecture of complex diseases is intensifying. The development of high-throughput genotyping and next-generation sequencing technologies enables genetic epidemiological analysis of large-scale data. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) responsible for disease susceptibility. The interactions between SNPs associated with complex diseases are increasingly being explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. This paper reviews the current methods and the related software packages for detecting the SNP interactions that contribute to diseases. The issues that need to be considered when developing these models are addressed in this review. The paper also reviews the achievements in data simulation for evaluating the performance of these models. Further, it discusses the future of SNP interaction analysis.
Advances in high-throughput human genome sequencing and in computational capabilities have greatly improved the understanding of the genetic architecture behind complex diseases. The development of high-throughput genotyping and next-generation sequencing technologies provides large-scale data for genetic epidemiological analysis. These advances have led to the identification of a number of single nucleotide polymorphisms (SNPs) associated with complex diseases. The interactions between SNPs responsible for disease susceptibility have been increasingly explored in the current literature. These interaction studies are mathematically challenging and computationally complex. These challenges have been addressed by a number of data mining and machine learning approaches. The goal of this research is to implement associative classification and study its effectiveness for detecting epistasis in balanced and imbalanced datasets. The proposed approach was evaluated for single-locus to six-locus models using simulated data. The datasets were generated for five different penetrance functions by varying heritability, minor allele frequency, and sample size. In total, 57,300 datasets were generated and several experiments were conducted to identify the disease-causal SNP interactions. The classification accuracy of the proposed approach was compared with that of existing approaches. The experimental results demonstrated significant improvements in accuracy for detecting interactions associated with the phenotype. Further, the approach was successfully applied to sporadic breast cancer data. The results show an interaction among six polymorphisms, which included five different estrogen-metabolism genes.
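As a rough illustration of how a case-control dataset can be generated from a penetrance function and a minor allele frequency, the following is a minimal sketch and not the simulation procedure used in the paper. It assumes Hardy-Weinberg genotype frequencies, a two-locus model, and simple rejection sampling to fill fixed case and control quotas. The penetrance values in `PEN` and the names `hwe_freqs`, `draw_genotype`, and `simulate` are hypothetical and chosen only for this example.

```python
import random

# Hypothetical two-locus penetrance table (illustrative values only).
# PEN[a][b] is the probability of disease given minor-allele counts
# a at locus 1 and b at locus 2 (each coded 0, 1, or 2).
PEN = [
    [0.0, 0.1, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, 0.0],
]


def hwe_freqs(maf):
    """Genotype frequencies for 0, 1, 2 minor alleles under Hardy-Weinberg."""
    q = maf
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def draw_genotype(rng, maf):
    """Draw one genotype (0, 1, or 2) for a SNP with the given MAF."""
    return rng.choices([0, 1, 2], weights=hwe_freqs(maf))[0]


def simulate(penetrance, maf, n_cases, n_controls, n_noise=2, seed=0):
    """Return a list of (genotypes, label) pairs; label 1 = case, 0 = control.

    The first two SNPs in each row interact through the penetrance table;
    the remaining n_noise SNPs are unrelated to disease status.
    Assumes the table yields both cases and controls with nonzero probability.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    while len(cases) < n_cases or len(controls) < n_controls:
        snps = [draw_genotype(rng, maf) for _ in range(2 + n_noise)]
        affected = rng.random() < penetrance[snps[0]][snps[1]]
        if affected and len(cases) < n_cases:
            cases.append(snps)
        elif not affected and len(controls) < n_controls:
            controls.append(snps)
    return [(row, 1) for row in cases] + [(row, 0) for row in controls]


if __name__ == "__main__":
    data = simulate(PEN, maf=0.5, n_cases=50, n_controls=50)
    print(len(data), "samples,", sum(label for _, label in data), "cases")
```

Varying `maf`, the sample sizes, and the entries of the penetrance table gives different scenarios. Controlling heritability would additionally require deriving the table from a target heritability, which this sketch does not attempt.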