Motivation: BRCA1 and BRCA2 are genes with tumor suppressor activity and are involved in a considerable number of biological processes. To help biologists classify tumors, we developed a deep learning algorithm. A central question when constructing a neural network is how many hidden layers and neurons to use. While the number of inputs and outputs is defined by the problem, the number of hidden layers and neurons is difficult to define. The number of hidden layers and the number of neurons in each layer influence the performance of the system's predictions. There are different methods for finding the optimal architecture. In this paper, we present the two packages that we have developed, based on the genetic algorithm (GA) and particle swarm optimization (PSO), to optimize the parameters of the neural network for predicting BRCA1 and BRCA2 pathogenicity. Results: We compare the results obtained by the two algorithms. We used datasets collected from our NGS analysis of BRCA1 and BRCA2 genes to train deep learning models. The data collection comprises 11,875 BRCA1 and BRCA2 variants. Our preliminary results show that PSO provided the best architecture, in terms of hidden layers and number of neurons, compared to grid search and GA. Conclusions: The optimal architecture found by the PSO algorithm is composed of 6 hidden layers with 275 hidden nodes, with an accuracy of 0.98, precision of 0.99, recall of 0.98, and specificity of 0.99.
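For readers less familiar with the four reported metrics, the following minimal Python sketch shows how accuracy, precision, recall, and specificity are conventionally derived from a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: pathogenic variants predicted pathogenic / benign.
    tn/fp: benign variants predicted benign / pathogenic.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Recall and sensitivity are the same quantity, which is why one abstract on this page reports "recall" and the other "sensitivity".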
BRCA1 and BRCA2 are genes with tumor suppressor activity, and they are involved in a considerable number of biological processes that regulate the cell replication cycle. A mutation in one of these two genes has a significant probability of causing cancer. We have set up within the platform a machine learning algorithm based on the random forest to predict pathogenicity in colorectal, melanoma, lung, and glioma cancers, but this algorithm has revealed its limits when predicting on more complex genes like BRCA1 and BRCA2. To help biologists classify tumors, we decided to develop a deep learning algorithm. The question we ask ourselves when constructing a neural network is how many hidden layers and neurons to use. While the number of inputs and outputs is defined by the problem to be solved, the number of hidden layers and neurons is difficult to define because there is no pre-established rule. The number of hidden layers and the number of neurons in each layer influence the performance of the system's predictions. There are different methods for finding the optimal architecture, such as grid search or empirical equations. All of these techniques can be very time-consuming. In this paper, we present the two packages that we have developed, based on the genetic algorithm (GA) and particle swarm optimization (PSO), to optimize the parameters of the neural network for predicting the pathogenicity of variants in the BRCA1 and BRCA2 genes. We compare the results obtained by the two algorithms. We used datasets collected from our NGS analysis of BRCA1 and BRCA2 genes to train deep learning models. The data collection comprises 11,875 BRCA1 and BRCA2 variants (BRCA1 benign 2,632, BRCA1 pathogenic 2,660, BRCA2 benign 3,446, BRCA2 pathogenic 3,137).
Our preliminary results show that PSO provided the best architecture, in terms of hidden layers and number of neurons, compared to grid search and GA. The optimal architecture found by the PSO algorithm is composed of 6 hidden layers with 275 hidden nodes, with an accuracy of 0.98, precision of 0.99, recall of 0.98, and specificity of 0.99.
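To illustrate the idea of searching for an architecture with particle swarm optimization, here is a minimal, standard-library-only Python sketch. It is not the authors' package: the function names, the search bounds, the PSO coefficients, and above all the toy objective (a smooth surrogate standing in for "train the network and return its validation error") are assumptions made for illustration.

```python
import random


def toy_validation_error(n_layers, n_neurons):
    # Hypothetical stand-in for training a network and measuring its error.
    # Its minimum is placed arbitrarily at (4 layers, 120 neurons).
    return ((n_layers - 4) / 10) ** 2 + ((n_neurons - 120) / 300) ** 2


def pso_search(objective, bounds, n_particles=20, n_iters=60,
               inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over a box; positions are rounded to integers to score."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    def score(p):
        return objective(*[int(round(x)) for x in p])

    pbest = [p[:] for p in pos]
    pbest_score = [score(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            s = score(pos[i])
            if s < pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], s
                if s < gbest_score:
                    gbest, gbest_score = pos[i][:], s
    return [int(round(x)) for x in gbest], gbest_score


# Search 1-10 hidden layers and 8-512 neurons (assumed bounds).
best_arch, best_err = pso_search(toy_validation_error, bounds=[(1, 10), (8, 512)])
print("best (layers, neurons):", best_arch, "error:", best_err)
```

In a real setting, each call to the objective would train and validate a network, so the number of particles and iterations directly determines the compute cost of the search.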
The advent of next-generation sequencing (NGS) technologies has revolutionized the field of bioinformatics and genomics, particularly in the area of onco-somatic genetics. NGS has provided a wealth of information about the genetic changes that underlie cancer and has considerably improved our ability to diagnose and treat cancer. However, the large amount of data generated by NGS makes it difficult to interpret the variants. To address this, machine learning algorithms such as Extreme Gradient Boosting (XGBoost) have become increasingly important tools in the analysis of NGS data. In this paper, we present a machine learning tool that uses XGBoost to predict the pathogenicity of mutations in the myeloid panel. We optimized the performance of XGBoost using metaheuristic algorithms and compared our predictions with the decisions of biologists and with other prediction tools. The myeloid panel is a critical component in the diagnosis and treatment of myeloid neoplasms, and sequencing this panel allows the identification of specific genetic mutations, enabling more accurate diagnoses and tailored treatment plans. We used datasets collected from our myeloid panel NGS analysis to train the XGBoost algorithm. The data collection comprises 15,977 variants: 13,221 single nucleotide variants (SNVs), 73 multiple nucleotide variants (MNVs), and 2,683 insertions/deletions (INDELs). The optimal XGBoost hyperparameters were found with Differential Evolution (DE), giving an accuracy of 99.35%, precision of 98.70%, specificity of 98.71%, and sensitivity of 100%.
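As a rough illustration of hyperparameter search with Differential Evolution, the following standard-library Python sketch implements the classic DE/rand/1/bin scheme. It is a sketch under stated assumptions, not the authors' tool: the two tuned hyperparameters, their bounds, the DE settings, and the toy loss (standing in for a cross-validated XGBoost loss) are all hypothetical.

```python
import random


def toy_loss(learning_rate, max_depth):
    # Hypothetical stand-in for a cross-validated model loss.
    # max_depth is an integer hyperparameter, so round it before scoring.
    depth = int(round(max_depth))
    return (learning_rate - 0.1) ** 2 + ((depth - 6) / 10) ** 2


def differential_evolution(objective, bounds, pop_size=15, n_gens=80,
                           F=0.8, CR=0.9, seed=0):
    """DE/rand/1/bin minimization over a box-constrained search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(*ind) for ind in pop]

    for _ in range(n_gens):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = pop[i][:]
            for d in range(dim):
                if rng.random() < CR or d == j_rand:
                    lo, hi = bounds[d]
                    mutant = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    trial[d] = min(hi, max(lo, mutant))
            s = objective(*trial)
            # Greedy selection: keep the trial only if it is no worse.
            if s <= scores[i]:
                pop[i], scores[i] = trial, s

    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]


# Assumed bounds: learning_rate in [0.01, 0.5], max_depth in [2, 12].
best_params, best_loss = differential_evolution(toy_loss, [(0.01, 0.5), (2, 12)])
print("learning_rate:", best_params[0], "max_depth:", int(round(best_params[1])))
print("loss:", best_loss)
```

To tune a real model, the toy loss would be replaced by a function that trains and cross-validates the classifier with the candidate hyperparameters and returns the resulting error.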