2011
DOI: 10.1007/s10916-011-9730-1
Enhanced Cancer Recognition System Based on Random Forests Feature Elimination Algorithm

Abstract: Accurate classifiers are vital to design precise computer aided diagnosis (CADx) systems. Classification performances of machine learning algorithms are sensitive to the characteristics of data. In this aspect, determining the relevant and discriminative features is a key step to improve performance of CADx. There are various feature extraction methods in the literature. However, there is no universal variable selection algorithm that performs well in every data analysis scheme. Random Forests (RF), an ensembl…
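The wrapper approach the abstract describes (a best-first search over feature subsets, each subset scored by a Random Forest evaluator) can be sketched in outline. The Python sketch below is illustrative only: `toy_score` is a stand-in for the paper's RF-based subset evaluator (e.g. out-of-bag accuracy), and all names are assumptions, not the authors' implementation.

```python
def toy_score(subset, relevant=frozenset({0, 2, 5})):
    """Stand-in for an RF-based subset evaluator (e.g. out-of-bag
    accuracy of a Random Forest trained on `subset`). Here: fraction
    of toy-relevant features covered, minus a small penalty per
    irrelevant feature, so the search has a clear optimum."""
    if not subset:
        return 0.0
    hits = len(set(subset) & relevant)
    return hits / len(relevant) - 0.01 * (len(subset) - hits)

def best_first_search(n_features, score, max_stale=3):
    """Greedy forward best-first search over feature subsets:
    repeatedly expand the current subset with the single feature
    that maximizes the score, stopping after `max_stale`
    consecutive non-improving expansions."""
    current = frozenset()
    best, best_score, stale = current, score(current), 0
    while stale < max_stale:
        candidates = [current | {f} for f in range(n_features) if f not in current]
        if not candidates:
            break
        current = max(candidates, key=score)
        if score(current) > best_score:
            best, best_score, stale = current, score(current), 0
        else:
            stale += 1
    return sorted(best), best_score

selected, s = best_first_search(8, toy_score)
print(selected)  # → [0, 2, 5], the toy-relevant features
```

A real evaluator would retrain the forest per candidate subset, which is why wrapper methods are accurate but computationally heavier than filter methods such as information gain.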

Cited by 17 publications (6 citation statements)
References 23 publications
“…, } in the training set is obtained through a set of pre-processing steps, i.e., tokenization, normalization, stop-word removal, stemming and removal of less-frequent words [7,10]. In this section, we briefly provide the general workflow of these steps with the use of a sample short-text from [11]: "The success of RF algorithm makes it eligible to be used as kernel of a wrapper feature subset evaluator. We used best first search RF wrapper algorithm to select optimal features of four medical datasets: colon cancer, leukemia cancer, breast cancer and lung cancer.…”
Section: Feature Representation and Pre-processing
Mentioning confidence: 99%
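The pre-processing steps named in the quoted passage (tokenization, normalization, stop-word removal, stemming, removal of less-frequent words) can be illustrated with a minimal sketch. The stop-word list and the crude suffix stripper below are placeholder assumptions, not the cited works' actual components (which would typically use a full stop-word list and a real stemmer such as Porter's).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; real pipelines use a full one.
STOP_WORDS = {"the", "of", "to", "it", "as", "a", "an", "we", "and", "be"}

def crude_stem(token):
    """Very crude suffix stripping; a placeholder for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(texts, min_freq=2):
    """Tokenize, lowercase-normalize, drop stop words, stem, then
    remove tokens appearing fewer than `min_freq` times overall."""
    docs = []
    for text in texts:
        tokens = re.findall(r"[a-z]+", text.lower())   # tokenize + normalize
        tokens = [t for t in tokens if t not in STOP_WORDS]
        docs.append([crude_stem(t) for t in tokens])
    freq = Counter(t for doc in docs for t in doc)      # corpus-wide counts
    return [[t for t in doc if freq[t] >= min_freq] for doc in docs]

print(preprocess(["The cancer dataset and the cancer study",
                  "A cancer review"]))
# → [['cancer', 'cancer'], ['cancer']]
```

Only tokens surviving every stage enter the training-set vocabulary, which is the input the quoted citation statement refers to.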
“…Fan et al [8] presented a model based on hybrid reasoning and a fuzzy decision tree (BFDT) for the detection of liver disease, with an accuracy of 81.6%, the highest among various other models. Ozcift [9] used the best-first search random forest algorithm and found a classification accuracy of 98.9%. Nguyen et al [10] [13].…”
Section: Literature Review
Mentioning confidence: 99%
“…Surprisingly, Markov blanket (MB) attribute selection was only discovered three times in the search for relevant publications, although it is a very useful hybrid approach for attribute selection and classification and is used in this dissertation to study attribute selection in connection with early DRG classification of inpatients.
[6, 7, 14, 17, 28, 39, 48, 50, 58, 72, 90, 97, 100, 109, 124, 129, 133, 139, 150, 164, 178, 187, 189-192, 202, 208, 210, 211, 215, 217, 220, 230, 237, 240, 245]
Correlation-based [5, 66, 71, 73, 88, 148]
Information gain [5, 9, 16, 66, 68, 91, 94, 99, 104, 128, 148]
Markov blanket [16, 88, 148]
Other attribute selection or evaluation techniques [5, 9, 11, 12, 15, 19, 25, 37, 38, 40, 41, 47, 49, 52, 56, 59-61, 66-68, 71, 73-75, 85, 88, 89, 94, 96, 98, 108, 110, 118, 122, 123, 128, 130, 132, 141-143, 146, 155-158, 168-170, 179, 180, 182, 183, 188, 209, 212, 221, 223, 225, 226, 232, 234-236, 241, 244, 246]
Principal component [11, 119-121, 143, 213, 219]
Relief algorithms [40, 41, 66, 148]
Wrapper [40, 41, 43, 66, 71, 88, 91, 126, 127, 148, 153, 159, …”
Section: Selection Criteria and Search for Relevant Literature
Mentioning confidence: 99%
“…They evaluate a sampling technique described by Robnik-Šikonja and Kononenko [184] and compare it with further attribute selection techniques using standard data sets. The attribute selection techniques are benchmarked [11, 16, 19, 37, 43, 49, 50, 58, 73, 85, 88-90, 97, 99, 110, 118, 119, 121, 122, 124, 127, 129, 132, 133, 155, 169, 171, 209, 215, 219, 220, 223, 244]
Bayesian networks [6, 16, 66, 75, 88, 122, 126, 133, 139, 150, 153, 210, 211, 246]
Combined classification [6, 15, 28, 37, 39, 43, 52, 66, 74, 88, 89, 94, 97, 100, 109, 148, 153, 158, 171, 213, 220, 223, 226]
Decision rules [6, 25, 52, 56, 61, 66, 91, …”
Section: Classification Techniques
Mentioning confidence: 99%