2019
DOI: 10.1038/s41598-019-48769-y

Prediction of future gastric cancer risk using a machine learning algorithm and comprehensive medical check-up data: A case-control study

Abstract: A comprehensive screening method using machine learning and many factors (biological characteristics, Helicobacter pylori infection status, endoscopic findings and blood test results), accumulated daily as data in hospitals, could improve the accuracy of screening to classify patients at high or low risk of developing gastric cancer. We used XGBoost, a classification method known for achieving numerous winning solutions in data analysis competitions, to capture nonlinear relations among …

Cited by 93 publications (67 citation statements)
References 34 publications
“…When using the data, it was randomly divided into two parts, with 80% used for developing window behavior models and 20% for model validation. This division has been popularly adopted in existing studies [45,46,49,51,61]. Table 3 has listed the calculated Pearson Correlation Coefficient for each potential influential factor considered in this study.…”
Section: Results (mentioning)
confidence: 99%
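The random 80/20 division and the Pearson correlation coefficient mentioned in the statement above can be sketched roughly as follows. This is a minimal illustration, not the citing authors' code: the helper names `split_80_20` and `pearson`, the fixed seed, and the toy temperature/window data are all assumptions made for the example.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, validation) in an 80/20 split."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: (outdoor temperature, window-open indicator).
rows = [(float(t), 1 if t > 20 else 0) for t in range(10, 30)]
train, validation = split_80_20(rows)

# Correlation between one candidate factor and the outcome, on the training part.
r = pearson([t for t, _ in train], [w for _, w in train])
```

Computing the correlation on the training part only keeps the validation rows untouched until the model is evaluated.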
“…As a new machine learning method, the XGBoost (eXtreme Gradient Boost) method was firstly introduced by Chen [37] in 2016, and has been used in many other applications, such as automotive manufacturing [38], predicting building cooling load [39] and fault detection for HVAC systems [40]. In existing studies, much evidence was available about its advantages (stability, accuracy and efficiency) in modeling complex process over other conventional machine learning methods, such as SVM algorithm [41,42], logistic regression method [43][44][45][46][47][48][49] and KNN/decision tree [50,51]. This study, therefore, was designed to justify its contribution to modeling accuracy of occupant window behavior in buildings, mainly against the most conventional modeling approach, i.e.…”
Section: Introduction (mentioning)
confidence: 99%
“…Over recent years, a variety of approaches for predicting radiation late effects have been developed [10-20], albeit with varying degrees of compromise between cost-effectiveness, throughput, and predictive power. One notable and extremely promising exception is the use of ML models, which can leverage extensive amounts of patient data to make accurate predictions of treatment outcomes [58][59][60][62][63][64].…”
Section: Discussion (mentioning)
confidence: 99%
“…We sought an alternative approach that could effectively utilize our vast dataset of pre-IMRT individual telomere length measurements (n=128,800), and also capture the nonlinearity of telomeric responses. Considering that XGBoost had recently been used to predict cancer risk and radiation-induced fibrosis using patient data [61][62][63][64], we hypothesized that XGBoost models could be trained with pre-IMRT individual telomere length measurements to accurately predict post-IMRT telomeric outcomes.…”
Section: Development Of Xgboost Machine Learning Models For Accurate … (mentioning)
confidence: 99%
“…As a symbolic case in this category, health examination data used to predict the risk of stomach cancer included a customized dataset that consisted of biological characteristics, infection conditions of Helicobacter pylori, endoscopic diagnosis, and blood testing in order to conduct XGboost learning. This resulted in the successful prediction of disease onset with an accuracy of AUC 0.899 [29]. In another case study, the use of machine learning based on individual treatment histories as presented by digitalized medical insurance receipt data, predicted the onset of Alzheimer's disease with an AUC of 0.730.…”
Section: Advanced Preventive Medicine Using Health Data (mentioning)
confidence: 99%
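For readers unfamiliar with the AUC figures quoted above (0.899 and 0.730), the metric can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes it by direct pairwise comparison. The function name `auc` and the toy labels and scores are assumptions for illustration and are not data from any of the cited studies.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks: 1 = developed the disease, 0 = did not.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
example_auc = auc(labels, scores)  # 8 of 9 case/control pairs are ordered correctly
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations only; a rank-based computation gives the same value more efficiently on large datasets.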