Feature selection and dimensionality reduction have recently become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as gene expression microarray data. Gene expression microarray data comprise up to hundreds of thousands of features with a relatively small sample size. Because learning algorithms usually do not work well with this kind of data, reducing the data dimensionality becomes a challenge. A large number of gene selection methods have been applied to select a subset of relevant features for model construction and to seek better cancer classification performance. This paper presents the basic taxonomy of feature selection and reviews the state-of-the-art gene selection methods by grouping the literature into three categories: supervised, unsupervised, and semi-supervised. A comparison of experimental results on the top five representative gene expression datasets indicates that the classification accuracy of unsupervised and semi-supervised feature selection is competitive with that of supervised feature selection.
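To make the supervised versus unsupervised distinction in this taxonomy concrete, the following is a minimal sketch of two filter-style rankers. It is an illustration only, not a method from the reviewed paper: the function names, the variance score, the class-mean-difference score, and the toy expression matrix are all assumptions chosen for brevity.

```python
from statistics import mean, pvariance


def rank_unsupervised(X):
    """Rank feature indices by variance, highest first (labels are not used)."""
    n_features = len(X[0])
    scores = [pvariance([row[j] for row in X]) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def rank_supervised(X, y):
    """Rank feature indices by |mean(class 0) - mean(class 1)|, highest first."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        class0 = [row[j] for row, label in zip(X, y) if label == 0]
        class1 = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(abs(mean(class0) - mean(class1)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: 4 samples (rows) x 3 genes (columns), binary labels.
X = [
    [1.0, 5.0, 0.0],
    [1.1, 5.0, 10.0],
    [3.0, 5.0, 0.0],
    [3.1, 5.0, 10.0],
]
y = [0, 0, 1, 1]

top_unsupervised = rank_unsupervised(X)[0]
top_supervised = rank_supervised(X, y)[0]
```

On this made-up matrix the two rankers choose different top genes, which only shows that the criteria use different information; it says nothing about the benchmark comparison reported in the abstract.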
Segmentation of the breast region and of the pectoral muscle are fundamental steps in Computer-Aided Diagnosis (CAD) systems. Segmenting the breast region and pectoral muscle is considered a difficult task, particularly in mammogram images, because of artefacts, homogeneity between the breast region and the pectoral muscle, low contrast along the breast boundary, similarity between the texture of the Region of Interest (ROI) and that of the unwanted region, and irregular ROIs. This study proposes an improved threshold-based and trainable segmentation model to derive the ROI. A hybrid segmentation approach for the boundary of the breast region and pectoral muscle in mammogram images was established based on thresholding and Machine Learning (ML) techniques. For breast boundary estimation, the breast region was highlighted by eliminating bands of the wavelet transform. The initial breast boundary was determined through a new thresholding technique. Morphological operations and masking were employed to correct the overestimated boundary by deleting small objects. In the medical imaging field, significant progress has been made in developing effective and accurate ML methods for segmentation, and the literature has highlighted the important role of ML methods in enabling more effective and accurate segmentation. In this study, an ML technique was built on the Histogram of Oriented Gradient (HOG) feature with neural network classifiers to determine the region of the pectoral muscle and the ROI. The proposed segmentation approach was tested on 322, 200, and 100 mammogram images from the Mammographic Image Analysis Society (mini-MIAS), INbreast, and Breast Cancer Digital Repository (BCDR) databases, respectively. The experimental results were compared with manual segmentation based on different texture features.
Moreover, the breast boundary and pectoral muscle segmentations were evaluated and compared separately. The experimental results showed that the breast boundary and pectoral muscle segmentation approaches obtained accuracies of 98.13% and 98.41% (mini-MIAS), 100% and 98.01% (INbreast), and 99.8% and 99.5% (BCDR), respectively. On average, the proposed method achieved 99.31% accuracy for breast boundary segmentation and 98.64% accuracy for pectoral muscle segmentation. The overall ROI accuracy of the proposed method improved after refining the threshold technique for background segmentation and building an ML technique for pectoral muscle segmentation. Furthermore, the ground truth was included to evaluate comprehensive similarity. In the clinic, this analysis may provide valuable support for breast cancer identification.
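The averaged accuracies quoted above can be checked with a few lines of arithmetic. The sketch below assumes the averages are plain unweighted means over the three databases (the abstract does not say how they were computed); the variable and function names are illustrative.

```python
# Per-database accuracies (%) as reported in the abstract, in the order
# mini-MIAS, INbreast, BCDR.
breast_boundary = [98.13, 100.0, 99.8]
pectoral_muscle = [98.41, 98.01, 99.5]


def unweighted_mean(values):
    """Arithmetic mean that gives each database equal weight."""
    return sum(values) / len(values)


breast_avg = round(unweighted_mean(breast_boundary), 2)
pectoral_avg = round(unweighted_mean(pectoral_muscle), 2)
print(breast_avg, pectoral_avg)
```

Under that assumption the computed values agree with the reported 99.31% and 98.64%, which suggests each database was given equal weight rather than being weighted by its number of images.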
Software testing is a vital and complex part of the software development life cycle. Optimization of software testing is still a major challenge, as prioritization of test cases remains unsatisfactory in terms of Average Percentage of Faults Detected (APFD) and execution time. This is attributed to the large search space that must be explored to find an optimal ordering of test cases. In this paper, we propose an approach to prioritize test cases optimally using the Firefly Algorithm. To optimize the ordering of test cases, we applied the Firefly Algorithm with a fitness function defined using a similarity distance model. Experiments were carried out on three benchmark programs with test suites extracted from the Software-artifact Infrastructure Repository (SIR). Our Test Case Prioritization (TCP) technique using the Firefly Algorithm with a similarity distance model performed as well as or better than existing works in terms of APFD and execution time. Overall APFD results indicate that the Firefly Algorithm is a promising competitor in TCP applications.
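For readers unfamiliar with the APFD metric mentioned above, here is a minimal sketch of its commonly used definition, APFD = 1 - (TF1 + ... + TFm) / (n * m) + 1 / (2n), where n is the number of test cases, m the number of faults, and TFi the position of the first test that reveals fault i. The function name, the test and fault identifiers, and the fault matrix are hypothetical; the Firefly Algorithm and the similarity distance fitness function from the paper are not reproduced here.

```python
def apfd(order, faults_by_test, num_faults):
    """Average Percentage of Faults Detected for one test ordering.

    order: list of test ids in execution order.
    faults_by_test: dict mapping test id -> set of fault ids it detects.
    num_faults: total number of faults; each must be detected by some test.
    """
    n = len(order)
    first_position = {}
    for position, test in enumerate(order, start=1):
        for fault in faults_by_test.get(test, set()):
            # Keep only the earliest position at which each fault is revealed.
            first_position.setdefault(fault, position)
    if len(first_position) != num_faults:
        raise ValueError("every fault must be detected by at least one test")
    return 1 - sum(first_position.values()) / (n * num_faults) + 1 / (2 * n)


# Hypothetical fault matrix: 4 tests, 4 faults.
faults_by_test = {
    "t1": {"f1"},
    "t2": {"f1", "f2", "f3"},
    "t3": set(),
    "t4": {"f4"},
}

good_order = apfd(["t2", "t4", "t1", "t3"], faults_by_test, num_faults=4)
poor_order = apfd(["t3", "t1", "t4", "t2"], faults_by_test, num_faults=4)
```

An ordering that runs fault-revealing tests earlier scores higher, which is the quantity a prioritization search such as the one described would try to maximize.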
Achieving an acceptable or desirable accuracy level in breast cancer image pattern recognition remains challenging, even when various schemes are used. Although multiple schemes have been combined to improve ultrasound image pattern recognition by reducing speckle noise, a satisfactory enhanced technique has not yet been achieved. The purpose of this study is to introduce a feature-based fusion scheme built on an enhanced uniform Local Binary Pattern (LBP) and filter-based noise reduction. To overcome the above limitations, a new descriptor that enhances the LBP features based on a new threshold is proposed. This paper proposes a multi-level fusion scheme for the automatic classification of static ultrasound images of breast cancer, which was carried out in two stages. First, several images were generated from a single image using the pre-processing method. The median and Wiener filters were used to reduce the speckle noise and enhance the ultrasound image texture. This strategy allowed the extraction of a powerful feature by reducing the overlap between the benign and malignant image classes. Second, the fusion mechanism allowed the production of diverse features from different filtered images. The feasibility of using the LBP-based texture feature to categorize the ultrasound images was demonstrated. The effectiveness of the proposed scheme was tested on 250 ultrasound images comprising 100 benign and 150 malignant images. The proposed method achieved very high accuracy (98%), sensitivity (98%), and specificity (99%). As a result, the fusion process, which supports a robust decision based on different features produced from different filtered images, improved the results of the new LBP descriptor in terms of accuracy, sensitivity, and specificity.
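As background for the LBP features discussed above, the following is a minimal sketch of the classic 8-neighbour LBP code and the usual uniformity test (at most two 0/1 transitions around the circle). This is the textbook operator, not the enhanced descriptor with the new threshold that the paper proposes, and the function names and 3x3 patches are made up for illustration.

```python
# Neighbour coordinates of a 3x3 patch, clockwise from the top-left corner.
NEIGHBOURS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def lbp_bits(patch):
    """Threshold the 8 neighbours of a 3x3 patch against its centre pixel."""
    center = patch[1][1]
    return [1 if patch[r][c] >= center else 0 for r, c in NEIGHBOURS]


def lbp_code(patch):
    """Pack the thresholded neighbours into an integer in the range 0..255."""
    return sum(bit << i for i, bit in enumerate(lbp_bits(patch)))


def is_uniform(patch):
    """A pattern is uniform if it has at most two circular 0/1 transitions."""
    bits = lbp_bits(patch)
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2


# Hypothetical grey-level patches.
edge_patch = [[9, 9, 9], [1, 5, 9], [1, 1, 1]]      # bright top, dark bottom
checker_patch = [[9, 1, 9], [1, 5, 1], [9, 1, 9]]   # alternating neighbours

edge_code = lbp_code(edge_patch)
edge_uniform = is_uniform(edge_patch)
checker_uniform = is_uniform(checker_patch)
```

A texture feature is then typically a histogram of such codes over the image, with all non-uniform patterns pooled into a single bin; the study's fusion step would combine features of this kind from differently filtered versions of the same image.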