2020 International Conference on Inventive Computation Technologies (ICICT) 2020
DOI: 10.1109/icict48043.2020.9112406

Ensemble Gain Ratio Feature Selection (EGFS) Model with Machine Learning and Data Mining Algorithms for Disease Risk Prediction

Cited by 11 publications
(5 citation statements)
References 12 publications
“…In Rao & Renuka (2020), Naive Bayes and a Decision Tree, built using the ID3 algorithm, are used to perform a binary prediction about whether the patient is affected by hypo- or hyperthyroidism. The authors of the study in Pasha & Mohamed (2020) perform feature selection on the UCI dataset for thyroid disease detection by exploiting both a Random Forest-based method and a Gain Ratio technique. In the end, the prediction is performed by comparing also different machine learning techniques, namely K-Nearest-Neighbor, Logistic Regression, and Naive Bayes.…”
Section: Results (mentioning)
confidence: 99%
“…Looking at the public datasets, we observed that the most used dataset is the UCI one (https://archive.ics.uci.edu/ml/datasets/Thyroid+Disease), exploited 27 times (Duggal & Shukla, 2020; Shahid et al, 2019; Pan et al, 2016; Pavya & Srinivasan, 2017; Mahurkar & Gaikwad, 2017; Ahmed & Soomrani, 2016; Tyagi, Mehra & Saxena, 2018; Kumar, 2020; Pasha & Mohamed, 2020; Shen et al, 2016; Bentaiba-Lagrid et al, 2020; Raisinghani et al, 2019; Vivar et al, 2020; Li et al, 2019b; Ma et al, 2018; Kour, Manhas & Sharma, 2020; Khan, 2021; Priyadharsini & Sasikala, 2022; Peya, Chumki & Zaman, 2021; Chaubey et al, 2021; Hosseinzadeh et al, 2021; Juneja, 2022; Kishor & Chakraborty, 2021; Islam et al, 2022; Saktheeswari & Balasubramanian, 2021; Chandel et al, 2016; Priya & Manavalan, 2018). The UCI dataset is characterized by 7,200 instances and 21 categorical and real attributes.…”
Section: Results (mentioning)
confidence: 99%
“…Examples of filter approaches are Gain Ratio (GR) and Information Gain (IG). GR is a technique to assess the reliability of dimensions by measuring the gain ratio [32]. The main workflow of IG is to choose a probability to evaluate the value in the data split by calculating the gain for each dimension.…”
Section: Feature Selection (mentioning)
confidence: 99%
“…The main workflow of IG is to choose a probability to evaluate the value in the data split by calculating the gain for each dimension. Any dimension with the highest gain will be selected as the discriminant subgroup [32]. In addition, the wrapper approach is a method that selects important features by calculating weights and measuring accuracy in segmentation to create a new feature set by increasing or decreasing the number of features from the original dataset [33].…”
Section: Feature Selection (mentioning)
confidence: 99%
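
The two filter measures named in the statements above, Information Gain and Gain Ratio, can be illustrated with a minimal sketch. It uses the standard textbook definitions (gain ratio = information gain divided by the split information of the feature) and is not taken from the cited papers; the feature names and toy values are hypothetical and chosen only to show how the ranking behaves.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the feature's own split information."""
    split_info = entropy(feature)
    if split_info == 0:  # feature takes a single value, so it cannot split
        return 0.0
    return information_gain(feature, labels) / split_info


# Hypothetical toy data: four patients, one binary label.
labels = ["sick", "sick", "healthy", "healthy"]
features = {
    "tsh_level": ["high", "high", "low", "low"],  # separates the classes
    "patient_id": ["a", "b", "c", "d"],           # unique per row
    "constant": ["x", "x", "x", "x"],             # carries no information
}

# Rank features from most to least useful under gain ratio.
ranked = sorted(features, key=lambda f: gain_ratio(features[f], labels), reverse=True)
print(ranked)
```

In this toy example both `tsh_level` and `patient_id` have the same information gain, but gain ratio halves the score of the identifier-like feature because its split information is larger. That normalisation is the usual reason for preferring gain ratio over raw information gain when features have many distinct values.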
“…The eight features subset was applied with the radial basis function (RBF) kernel-based SVM, and they achieved an accuracy of 85.82%. Javeed et al [18] designed an ensemble gain ratio feature selection (EGFS) model employing an RF as an ensemble algorithm and a gain ratio algorithm in order to extract features that aid in performance improvement. Via KNN, LR, and NB, they applied their model to medical datasets of UCI to attain an accurate disease risk prediction.…”
Section: A Feature Selection Techniques In Heart Disease Risk Predic (mentioning)
confidence: 99%