Cancer is one of the leading causes of death in many countries, in part because sufficiently effective treatments are lacking. One of the most common types is non-small-cell lung cancer (NSCLC). The increasingly large and diverse public datasets on NSCLC are a rich source of data for analyses aimed at finding candidate oncogenic drivers or therapeutic targets. The aim of this study is to reanalyze an existing NSCLC dataset from NCBI GEO (accession GSE19804) to see whether novel genes involved in the disease can be found. To this end, we preprocessed the microarray data and, using random forest, support vector machine, and C5.0 decision tree models, compared the 10 most important genes reported by each model. The analysis was carried out in RStudio with R 4.0.2 and Bioconductor 3.11. In conclusion, EFNA4 and other genes, namely KANK3, GRK5, CLIC5, SH3GL3, ACACB, LIN7A, JCAD, and NEDD1, are potential candidates that may play a role in NSCLC, and we recommend that wet-laboratory researchers focus on these genes.
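The comparison of top-ranked genes across the three models can be illustrated with a minimal sketch. It is not the study's code (the analysis was done in R), and it makes several assumptions: the function names `top_k` and `consensus` are hypothetical, the gene labels are placeholders, and the importance scores are invented toy values rather than results from GSE19804. Each model is assumed to yield one importance score per gene.

```python
from collections import Counter


def top_k(importances, k=10):
    """Return the k gene names with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def consensus(top_lists, min_models=2):
    """Keep genes that appear in the top list of at least min_models models."""
    counts = Counter(gene for genes in top_lists.values() for gene in set(genes))
    return {gene: n for gene, n in counts.items() if n >= min_models}


# Toy importance scores (hypothetical placeholders, not study results).
scores = {
    "random_forest": {"GENE_A": 0.9, "GENE_B": 0.7, "GENE_C": 0.2, "GENE_D": 0.1},
    "svm": {"GENE_A": 0.8, "GENE_C": 0.6, "GENE_B": 0.3, "GENE_D": 0.05},
    "c50_tree": {"GENE_A": 0.5, "GENE_D": 0.4, "GENE_B": 0.35, "GENE_C": 0.0},
}

# k=2 keeps the toy example small; the study compared the top 10 per model.
top_lists = {model: top_k(imp, k=2) for model, imp in scores.items()}
shared = consensus(top_lists, min_models=2)
```

Here `shared` maps each gene to the number of models that ranked it in their top list, so genes selected by several models stand out as the more robust candidates.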