This paper studies the effect of filter-based feature selection methods on the accuracy and error of supervised cancer classification. Four selection methods (Fisher, T-statistics, SNR and ReliefF) are compared on datasets for three cancers: leukemia, prostate cancer and colon cancer.
The classification results obtained with k-nearest neighbors (KNN) and support vector machine (SVM) classifiers show that combining the SNR method with the SVM classifier can yield the highest accuracy.
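As a rough illustration of a filter approach, the sketch below ranks features by a signal-to-noise ratio before any classifier is trained. It assumes the common two-class definition |mean_0 - mean_1| / (std_0 + std_1); the abstract does not state the exact formula used, and the function names and toy data are hypothetical.

```python
from statistics import mean, stdev


def snr_score(values, labels):
    """Signal-to-noise ratio of one feature for a two-class problem (labels 0/1).

    Assumed definition: |mean_0 - mean_1| / (std_0 + std_1).
    """
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    spread = stdev(a) + stdev(b)
    return abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0


def rank_features(rows, labels):
    """Return feature indices sorted from highest to lowest SNR score."""
    n_features = len(rows[0])
    scores = [snr_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: 6 samples, 3 features, two classes.
rows = [
    [1.0, 0.1, 5.0],
    [1.2, 0.2, 4.0],
    [0.9, 0.0, 6.0],
    [1.1, 2.0, 5.5],
    [1.3, 2.1, 4.5],
    [1.0, 1.9, 5.0],
]
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_features(rows, labels)
print(ranking)  # feature indices, most discriminative first
```

In a filter scheme, only the top-ranked features would then be passed to the KNN or SVM classifier; how many to keep is a tuning choice that the abstract does not specify.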
In this paper we present a high-accuracy computer-aided diagnosis scheme. The goal of the developed system is to classify benign and malignant microcalcifications on mammograms. It combines wavelet decomposition, feature extraction and classification using Fisher's linear discriminant. Wavelet decomposition is used to denoise and enhance the regions of interest (ROIs) containing abnormalities. Feature extraction is performed using spatial grey level dependence (SGLD) matrices. The purpose of classification is to assign an object to a certain class, and many classification methods have been described. Here we use Fisher's linear discriminant, which is particularly useful for discriminating between two classes in a multidimensional space. Since it is based only on the first and second moments of each distribution, it is not a computationally intensive method. Our results show that the developed method is effective for classifying benign and malignant microcalcifications, with an accuracy of 95.5%.
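To make the "first and second moments" remark concrete, here is a minimal sketch of Fisher's linear discriminant for two classes and two features. It uses the standard direction w = Sw^-1 (m1 - m0) with a threshold halfway between the projected class means; the midpoint threshold, the function names and the toy points are assumptions, not details taken from the paper.

```python
def mean_vec(points):
    """First moment: the 2-D class mean."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, m):
    """Second central moments: the 2x2 within-class scatter matrix."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    return [[sxx, sxy], [sxy, syy]]


def fisher_discriminant(class0, class1):
    """Return (w, threshold) with w = Sw^-1 (m1 - m0)."""
    m0, m1 = mean_vec(class0), mean_vec(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    a, b = s0[0][0] + s1[0][0], s0[0][1] + s1[0][1]
    c, d = s0[1][0] + s1[1][0], s0[1][1] + s1[1][1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = m1[0] - m0[0], m1[1] - m0[1]
    # Closed-form inverse of the 2x2 matrix [[a, b], [c, d]] applied to (dx, dy).
    w = [(d * dx - b * dy) / det, (a * dy - c * dx) / det]
    proj0 = w[0] * m0[0] + w[1] * m0[1]
    proj1 = w[0] * m1[0] + w[1] * m1[1]
    return w, 0.5 * (proj0 + proj1)  # assumed midpoint threshold


def classify(x, w, threshold):
    """Return 1 if the projection of x falls on the class-1 side, else 0."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


# Hypothetical 2-D feature vectors (e.g. two texture features per ROI).
benign = [(1, 2), (2, 1), (2, 3), (3, 2)]
malignant = [(5, 6), (6, 5), (6, 7), (7, 6)]

w, threshold = fisher_discriminant(benign, malignant)
predictions = [classify(p, w, threshold) for p in benign + malignant]
```

Only class means and scatter matrices are needed, which is why the method is cheap; with more than two features the same formula applies, with a general linear solve replacing the 2x2 inverse.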
Feature selection involves identifying a subset of the most useful features that produces the same results as the original set of features. In this paper, we present a new approach for improving classification accuracy. This approach is based on quantum clustering for feature subset selection and the wavelet transform for feature extraction. The feature selection is performed in three steps. First, the mammographic image undergoes a wavelet transform and features are extracted. In the second step, the original feature space is partitioned into clusters in order to group similar features. This operation is performed using the Quantum Clustering algorithm. The third step deals with the selection of a representative feature for each cluster. This selection is based on similarity measures such as the correlation coefficient (CC) and the mutual information (MI). The feature that maximizes the chosen measure (CC or MI) is selected by the algorithm. This approach is applied to breast cancer classification, using the K-nearest neighbors (KNN) classifier. We report classification accuracy as a function of feature type, wavelet transform and the number of neighbors K in the KNN classifier. An accuracy of 100% was reached in some cases.
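The third step (one representative feature per cluster) could be sketched as follows. This is an illustration under stated assumptions: the clusters are taken as given rather than computed by Quantum Clustering, and the representative is assumed to be the feature with the highest mean absolute Pearson correlation to the other features in its cluster, since the abstract does not say what the similarity is measured against. All names and data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def representative(cluster, columns):
    """Pick the feature most correlated, on average, with its cluster mates."""
    if len(cluster) == 1:
        return cluster[0]

    def avg_similarity(j):
        others = [k for k in cluster if k != j]
        return sum(abs(pearson(columns[j], columns[k])) for k in others) / len(others)

    return max(cluster, key=avg_similarity)


# Hypothetical feature columns (feature index -> values over 4 samples).
columns = {
    0: [1.0, 2.0, 3.0, 4.0],
    1: [1.1, 1.9, 3.2, 3.9],
    2: [2.0, 1.0, 4.0, 3.0],
    3: [5.0, 1.0, 4.0, 2.0],
}
# Assumed output of the clustering step: groups of similar feature indices.
clusters = [[0, 1, 2], [3]]

selected = [representative(c, columns) for c in clusters]
```

Swapping `pearson` for a mutual information estimate would give the MI variant mentioned above; the reduced feature set `selected` is what would be passed to the KNN classifier.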
A fundamental problem in machine learning is identifying the most representative subset of features from which we can construct a predictive model for a classification task. This paper presents a validation study of the effect of dimensionality reduction on the classification accuracy of mammographic images. The dimensionality reduction methods studied were locality-preserving projection (LPP), locally linear embedding (LLE), Isometric Mapping (ISOMAP) and spectral regression (SR). We achieved high classification rates: 100% for some combinations, and about 95% in most cases. We also found that the classification rate increases with the size of the reduced space and that the optimal dimension of the reduced space is 60. We validated these results by measuring cluster validity indices: the Xie-Beni index, the Dunn index and the Alternative Dunn index. These indices confirm that the optimal dimension of the reduced space is d=60.
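As an illustration of one such validity index, the sketch below computes the Dunn index in its classic form: the smallest distance between points of different clusters divided by the largest cluster diameter. Several variants of the index exist and the abstract does not say which one was used, so this form, the function name and the toy clusters are assumptions.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn index for a list of clusters, each a list of point tuples.

    Higher values indicate compact, well-separated clusters.
    """
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    # Smallest distance between any two points in different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between any two points in the same cluster.
    max_diameter = max(
        (dist(p, q) for c in clusters for p, q in combinations(c, 2)),
        default=0.0,
    )
    if max_diameter == 0:
        raise ValueError("all clusters are single points; index is undefined")
    return min_separation / max_diameter


# Hypothetical 2-D embeddings of two classes at two levels of compactness.
tight = [[(0, 0), (0, 1)], [(5, 0), (5, 1)]]
loose = [[(0, 0), (0, 4)], [(5, 0), (5, 4)]]

tight_score = dunn_index(tight)
loose_score = dunn_index(loose)
```

Comparing the index across candidate dimensions of the reduced space, and keeping the dimension with the best score, is one way such an index can support the choice of d.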