Objective: Knowing the underlying cause of hepatocellular carcinoma (HCC) is crucial for optimal management. This study aims to classify open-access gene expression data of HCC patients who have an HBV or HCV infection using the XGBoost method. Material and Methods: This case-control study considered open-access gene expression data of patients with HBV-related HCC and HCV-related HCC. For this purpose, data from 17 patients with HBV+HCC and 17 patients with HCV+HCC were included. An XGBoost model was constructed for the classification and assessed via tenfold cross-validation. Accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were evaluated to assess model performance. Results: The feature selection approach chose 17 genes, and modeling was done using these as input variables. The accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score obtained from the XGBoost model were 97.1%, 97.1%, 94.1%, 100%, 100%, 94.4%, and 97%, respectively. Based on the variable importance findings from XGBoost, the ALDOC, GLUD2, TRAPPC10, FLJ12998, RPL39, KDELR2, and KIAA0446 genes can be employed as potential biomarkers for HBV-related HCC.
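The seven performance metrics reported in the HCC abstract above all follow from a single 2×2 confusion matrix. The sketch below is an illustration only, not the authors' code: the abstract does not report the confusion matrix or say which class was treated as positive, so the counts used here (16 true positives, 1 false negative, 17 true negatives, 0 false positives) are an assumption, chosen because they are consistent with 17 patients per group and reproduce the published percentages. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return the seven metrics named in the abstract, as fractions in [0, 1]."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # positive predictive value
        "npv": tn / (tn + fn),     # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Assumed counts (not given in the abstract): 17 patients per class,
# with one positive-class patient misclassified as negative.
metrics = classification_metrics(tp=16, fn=1, tn=17, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts the values round to 97.1%, 97.1%, 94.1%, 100%, 100%, 94.4%, and 97.0%, in the order the abstract lists them. With only 34 patients, a single misclassification shifts accuracy by about three percentage points.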
Background/Aims: The aim of this study was both to classify data of familial adenomatous polyposis patients with and without duodenal cancer and to identify important genes that may be related to duodenal cancer using an XGBoost model. Materials and Methods: The current study was performed using expression profile data from a series of duodenal samples from familial adenomatous polyposis patients to explore variations in the familial adenomatous polyposis duodenal adenoma–carcinoma sequence. The expression profiles obtained from cancerous, adenomatous, and normal tissues of 12 familial adenomatous polyposis patients with duodenal cancer and the tissues of 12 familial adenomatous polyposis patients without duodenal cancer were compared. The ElasticNet approach was used for feature selection. XGBoost, a machine learning approach, was used with 5-fold cross-validation to classify duodenal cancer. Accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were assessed to evaluate model performance. Results: According to the variable importance obtained from the modeling, the ADH1C, DEFA5, CPS1, SPP1, DMBT1, VCAN-AS1, and APOB genes (cancer vs. adenoma); the LOC399753, APOA4, MIR548X, and ADH1C genes (adenoma vs. adenoma); and the SNORD123, CEACAM6, SNORD78, ANXA10, SPINK1, and CPS1 genes (normal vs. adenoma) can be used as predictive biomarkers. Conclusions: The proposed model shows that the aforementioned genes can forecast the risk of duodenal cancer in patients with familial adenomatous polyposis. More comprehensive analyses should be performed in the future to assess the reliability of the genes identified.
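The 5-fold cross-validation mentioned in the abstract above can be sketched as follows. This is a minimal, library-free illustration and not the authors' pipeline: the abstract names neither the software used nor whether the folds were stratified by class, so stratification is an assumption here, and the function names `stratified_folds` and `cross_validation_splits` are hypothetical. The label vector (12 samples per group) mirrors the patient counts only; the number of tissue samples per comparison is not stated in the abstract.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Assign every sample index to one of k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validation_splits(labels, k, seed=0):
    """Yield (train_indices, test_indices) pairs, one per held-out fold."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            index
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for index in fold
        )
        yield train, test


# Hypothetical labels: 12 samples per group, mirroring the patient counts.
labels = ["cancer"] * 12 + ["no_cancer"] * 12
splits = list(cross_validation_splits(labels, k=5))
for fold_id, (train, test) in enumerate(splits):
    print(f"fold {fold_id}: {len(train)} training samples, {len(test)} test samples")
```

Each sample is held out exactly once, so a model is fitted five times and every prediction used for the performance metrics comes from a model that never saw that sample. For the estimate to be unbiased, feature selection would also need to be repeated inside each training fold; the abstract does not say whether this was done.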