Two New Metrics for Feature Selection in Pattern Recognition

Piñero, Pedro Y.; Arco, Leticia; García, María M.; Caballero, Yailé; Yzquierdo, Raykenler; Morales, Alfredo

doi:10.1007/978-3-540-24586-5_60

Cited by 7 publications

(4 citation statements)

References 12 publications


“…To further validate the developed algorithms, we compared the classification results from this investigation with classic feature selection methods such as SVM-RFE (SVM-Recursive Feature Elimination) [34], ARCO (Area Under the Curve (AUC) and Rank Correlation coefficient Optimization) [35], Relief [36] and mRMR (minimal-redundancy-maximal-relevance) [37] using our data. The mRMR method recorded the highest classification when the number of features/genes was 32, which recorded an accuracy of 83%.…”

Section: Comparative Evaluation and Validation of SVM Results (mentioning)

confidence: 99%

“…For evaluation and comparison of the classification and misclassification performance of the four ML algorithms, we used 4 different scenarios in which any sample could end up or fall into: (a) true positive (TP) which means the sample was predicted as TNBC and was the correct prediction; (b) true negative (TN) which means the sample was predicted as non-TNBC and this was the correct prediction; (c) false positive (FP) which means the sample was predicted as TNBC, but was non-TNBC, and (d) false negative (FN) which means the sample was predicted as non-TNBC, but was TNBC. Using this information, we evaluated the classification results of the model by calculating the overall accuracy. To further validate the methods, the classification results were also compared with classic feature selection methods such as SVM-RFE [34], ARCO [35], Relief [36] and mRMR [37]. The SVM-RFE relies on constructing feature ranking coefficients based on the weight vector generated by SVM during training.…”

Section: Modeling Prediction and Performance Evaluation (mentioning)

confidence: 99%
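The four outcomes named in the citation statement above (TP, TN, FP, FN) and the overall accuracy derived from them can be illustrated with a minimal sketch. The function names, the label strings, and the five-sample data are hypothetical and chosen only for illustration; they are not taken from the cited paper.

```python
def confusion_counts(y_true, y_pred, positive="TNBC"):
    """Count TP, TN, FP and FN, treating `positive` as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # predicted TNBC, actually TNBC
            else:
                fp += 1  # predicted TNBC, actually non-TNBC
        else:
            if truth == positive:
                fn += 1  # predicted non-TNBC, actually TNBC
            else:
                tn += 1  # predicted non-TNBC, actually non-TNBC
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


def overall_accuracy(counts):
    """Overall accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = sum(counts.values())
    return (counts["TP"] + counts["TN"]) / total if total else 0.0


# Hypothetical labels, for illustration only.
y_true = ["TNBC", "TNBC", "non-TNBC", "non-TNBC", "non-TNBC"]
y_pred = ["TNBC", "non-TNBC", "non-TNBC", "TNBC", "non-TNBC"]

counts = confusion_counts(y_true, y_pred)
print(counts)                    # {'TP': 1, 'TN': 2, 'FP': 1, 'FN': 1}
print(overall_accuracy(counts))  # 0.6
```

With heavily imbalanced classes, overall accuracy alone can look high even when most positive samples are missed, so the individual counts are worth reporting alongside it.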


Breast Cancer Type Classification Using Machine Learning

Hicks

2021

JPM

128


Background: Breast cancer is a heterogeneous disease defined by molecular types and subtypes. Advances in genomic research have enabled use of precision medicine in clinical management of breast cancer. A critical unmet medical need is distinguishing triple negative breast cancer, the most aggressive and lethal form of breast cancer, from non-triple negative breast cancer. Here we propose use of a machine learning (ML) approach for classification of triple negative breast cancer and non-triple negative breast cancer patients using gene expression data. Methods: We performed analysis of RNA-Sequence data from 110 triple negative and 992 non-triple negative breast cancer tumor samples from The Cancer Genome Atlas to select the features (genes) used in the development and validation of the classification models. We evaluated four different classification models including Support Vector Machines, K-nearest neighbor, Naïve Bayes and Decision tree using features selected at different threshold levels to train the models for classifying the two types of breast cancer. For performance evaluation and validation, the proposed methods were applied to independent gene expression datasets. Results: Among the four ML algorithms evaluated, the Support Vector Machine algorithm was able to classify breast cancer more accurately into triple negative and non-triple negative breast cancer and had less misclassification errors than the other three algorithms evaluated. Conclusions: The prediction results show that ML algorithms are efficient and can be used for classification of breast cancer into triple negative and non-triple negative breast cancer types.


“…In this algorithm we use the terms R(A) and H(A) proposed in [20]. R(A) lies within [0,1] and stands for the relative importance of attribute A while H(A) represents heuristic information about a subset of candidate features.…”

Section: A Greedy Algorithm to Feature Selection (mentioning)

confidence: 99%

Rough Sets and Evolutionary Computation to Solve the Feature Selection Problem

Bello

Gómez

Caballero

et al.

Studies in Computational Intelligence


Summary. The feature selection problem has been usually addressed through heuristic approaches given its significant computational complexity. In this context, evolutionary techniques have drawn the researchers' attention owing to their appealing optimization capabilities. In this chapter, promising results achieved by the authors in solving the feature selection problem through a joint effort between rough set theory and evolutionary computation techniques are reviewed. In particular, two new heuristic search algorithms are introduced, i.e. Dynamic Mesh Optimization and another approach which splits the search process carried out by swarm intelligence methods.


“…In this algorithm we use the terms R(A) and H(A) proposed in [Piñ03]. The expression for R(A), which is a relevance measure of the attributes (0 ≤ R(A) ≤ 1), is:…”

Section: Feature Selection by Using an Evolutionary Approach (mentioning)

confidence: 99%

Two new feature selection algorithms with Rough Sets Theory

Caballero

Bello

Alvarez

et al.

IFIP International Federation for Information Processing


Rough Sets Theory has opened new trends for the development of the Incomplete Information Theory. Within it, the notion of a reduct is very significant, but obtaining a reduct in a decision system is a computationally expensive process, although a very important one in data analysis and knowledge discovery. Because of this, it has been necessary to develop different variants for calculating reducts. The present work looks into the utility that the Rough Sets Model and Information Theory offer in feature selection, and a new method is presented for calculating a good reduct. This new method consists of a greedy algorithm that uses heuristics to work out a good reduct in acceptable time. In this paper we also propose another method to find good reducts, which combines elements of Genetic Algorithms with Estimation of Distribution Algorithms. The new methods are compared with others implemented within Pattern Recognition and Ant Colony Optimization Algorithms, and the results of the statistical tests are shown.

Introduction

Feature selection is an important task in Machine Learning. It consists of focusing on the most relevant features for representing data, in order to delete those features considered irrelevant, which make a knowledge discovery process inside a database more difficult. Feature subset selection represents the problem of finding an optimal subset of features (attributes) of a database according to some criterion, so that a classifier with the highest possible accuracy can be generated by an inductive learning algorithm that is run on data containing only the subset of features. However, this beneficial alternative is limited because of the computational complexity of calculating reducts. [Bel98] shows that the computational cost of finding a reduct in the information system is bounded by $l^2 m^2$, where l is the length of the attributes and m is the number of objects in the universe of the information system, while the time complexity of finding all the reducts of the information system is O(2
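As a rough illustration of what a greedy search for a reduct can look like, the sketch below grows an attribute subset one attribute at a time until it discriminates the decision classes as well as the full attribute set does. It is an assumption-laden stand-in: it uses the classical rough-set dependency degree (size of the positive region divided by the number of objects) as its heuristic, in the style of the well-known QuickReduct procedure, and not the R(A) and H(A) measures of the cited work, whose formulas are not reproduced on this page. All function names and the toy decision table are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object indices into indiscernibility classes over `attrs`."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, decisions, attrs):
    """Dependency degree: share of objects lying in decision-pure classes."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({decisions[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, decisions, attributes):
    """Add the attribute with the best dependency gain until the subset
    matches the dependency degree of the full attribute set."""
    target = dependency(rows, decisions, attributes)
    selected = []
    while dependency(rows, decisions, selected) < target:
        candidates = [a for a in attributes if a not in selected]
        best = max(
            candidates,
            key=lambda a: dependency(rows, decisions, selected + [a]),
        )
        selected.append(best)
    return selected


# Hypothetical decision table: the decision is "yes" only when a == 1 and b == 1;
# attribute c is constant and therefore irrelevant.
rows = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 0},
    {"a": 1, "b": 0, "c": 0},
    {"a": 1, "b": 1, "c": 0},
]
decisions = ["no", "no", "no", "yes"]

print(greedy_reduct(rows, decisions, ["a", "b", "c"]))  # ['a', 'b']
```

A greedy pass of this kind evaluates only a polynomial number of subsets, which is the motivation for heuristics given the exponential cost of enumerating all reducts. The subset it returns preserves the dependency degree but is not guaranteed to be minimal, so a pruning step may still be needed.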

