Impact of feature selection on classification via clustering techniques in software defect prediction

Usman-Hamza, Fatimah E.; Atte, A.F.; Balogun, Abdullateef Oluwagbemiga; Mojeed, Hammed A.; Bajeh, Amos Orenyi; Adeyemo, Victor Elijah

doi:10.4314/jcsia.v26i1.8

Cited by 8 publications

(5 citation statements)

References 0 publications

“…Exploratory analysis and dimensionality reduction are two typical unsupervised ML applications. Unsupervised ML approaches may be utilized to acquire first insights into data in situations when a human examination is difficult [131]. The findings may be used to test various theories.…”

Section: Artificial Intelligence (mentioning)

confidence: 99%

Recent Advancements in Emerging Technologies for Healthcare Management Systems: A Survey

et al. 2022

In recent times, the growth of the Internet of Things (IoT), artificial intelligence (AI), and Blockchain technologies have quickly gained pace as a new study niche in numerous collegiate and industrial sectors, notably in the healthcare sector. Recent advancements in healthcare delivery have given many patients access to advanced personalized healthcare, which has improved their well-being. The subsequent phase in healthcare is to seamlessly consolidate these emerging technologies such as IoT-assisted wearable sensor devices, AI, and Blockchain collectively. Surprisingly, owing to the rapid use of smart wearable sensors, IoT and AI-enabled technology are shifting healthcare from a conventional hub-based system to a more personalized healthcare management system (HMS). However, implementing smart sensors, advanced IoT, AI, and Blockchain technologies synchronously in HMS remains a significant challenge. Prominent and reoccurring issues such as scarcity of cost-effective and accurate smart medical sensors, unstandardized IoT system architectures, heterogeneity of connected wearable devices, the multidimensionality of data generated, and high demand for interoperability are vivid problems affecting the advancement of HMS. Hence, this survey paper presents a detailed evaluation of the application of these emerging technologies (Smart Sensor, IoT, AI, Blockchain) in HMS to better understand the progress thus far. Specifically, current studies and findings on the deployment of these emerging technologies in healthcare are investigated, as well as key enabling factors, noteworthy use cases, and successful deployments. This survey also examined essential issues that are frequently encountered by IoT-assisted wearable sensor systems, AI, and Blockchain, as well as the critical concerns that must be addressed to enhance the application of these emerging technologies in the HMS.

“…In addition, Marjuni, Adji [46] developed an unsupervised ML technique named signed Laplacian-based spectral classifier for SDP. The unsupervised ML technique is applied when working on unlabelled datasets, unlike supervised ML [14]. However, the performance of an ML technique depends largely on the quality of the datasets used for training such an ML technique [47][48][49].…”

Section: Related Work (mentioning)

confidence: 99%

“…In particular, SDP deploys ML methods on software features that are defined by software metrics to contain defects in software modules or components [7][8][9]. Several studies have proposed and implemented both supervised and unsupervised forms of ML methods for SDP [10][11][12][13][14][15]. Nevertheless, the predictive performance of SDP models is flatly dependent on the quality and inherent nature of the software datasets used for developing such SDP models.…”

Section: Introduction (mentioning)

confidence: 99%

An Adaptive Rank Aggregation-Based Ensemble Multi-Filter Feature Selection Method in Software Defect Prediction

Balogun, Basri, Capretz, et al. 2021

Entropy

Feature selection is known to be an applicable solution to address the problem of high dimensionality in software defect prediction (SDP). However, choosing an appropriate filter feature selection (FFS) method that will generate and guarantee optimal features in SDP is an open research issue, known as the filter rank selection problem. As a solution, the combination of multiple filter methods can alleviate the filter rank selection problem. In this study, a novel adaptive rank aggregation-based ensemble multi-filter feature selection (AREMFFS) method is proposed to resolve high dimensionality and filter rank selection problems in SDP. Specifically, the proposed AREMFFS method is based on assessing and combining the strengths of individual FFS methods by aggregating multiple rank lists in the generation and subsequent selection of top-ranked features to be used in the SDP process. The efficacy of the proposed AREMFFS method is evaluated with decision tree (DT) and naïve Bayes (NB) models on defect datasets from different repositories with diverse defect granularities. Findings from the experimental results indicated the superiority of AREMFFS over other baseline FFS methods that were evaluated, existing rank aggregation based multi-filter FS methods, and variants of AREMFFS as developed in this study. That is, the proposed AREMFFS method not only had a superior effect on prediction performances of SDP models but also outperformed baseline FS methods and existing rank aggregation based multi-filter FS methods. Therefore, this study recommends the combination of multiple FFS methods to utilize the strength of respective FFS methods and take advantage of filter–filter relationships in selecting optimal features for SDP processes.

“…Specifically, SDP is the application of ML techniques on software defect datasets which are characterized by software metrics (as features) to ascertain defects in software modules or components [8][9][10]. From studies, both supervised and unsupervised types of ML techniques have been proposed and implemented for SDP [11][12][13][14][15][16]. However, the prediction performance of SDP models categorically depends on the nature (quality) and characteristics of software datasets used in developing SDP models.…”

Section: Introduction (mentioning)

confidence: 99%

Empirical Analysis of Rank Aggregation-Based Multi-Filter Feature Selection Methods in Software Defect Prediction

et al. 2021

Selecting the most suitable filter method that will produce a subset of features with the best performance remains an open problem that is known as filter rank selection problem. A viable solution to this problem is to independently apply a mixture of filter methods and evaluate the results. This study proposes novel rank aggregation-based multi-filter feature selection (FS) methods to address high dimensionality and filter rank selection problem in software defect prediction (SDP). The proposed methods combine rank lists generated by individual filter methods using rank aggregation mechanisms into a single aggregated rank list. The proposed methods aim to resolve the filter selection problem by using multiple filter methods of diverse computational characteristics to produce a disjoint and complete feature rank list superior to individual filter rank methods. The effectiveness of the proposed method was evaluated with Decision Tree (DT) and Naïve Bayes (NB) models on defect datasets from NASA repository. From the experimental results, the proposed methods had a superior impact (positive) on prediction performances of NB and DT models than other experimented FS methods. This makes the combination of filter rank methods a viable solution to filter rank selection problem and enhancement of prediction models in SDP.
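
As a rough illustration of the rank aggregation idea described in this abstract, the Python sketch below combines rank lists from several filter methods into one aggregated list. It assumes arithmetic mean rank as the aggregation mechanism, which the abstract does not specify. The filter names, software metric names, and scores are hypothetical and do not come from the cited study.

```python
from statistics import mean


def scores_to_ranks(scores):
    """Turn {feature: score} into {feature: rank}, where rank 1 is the best.

    Higher scores are treated as better; ties are broken by feature name.
    """
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: pos for pos, feature in enumerate(ordered, start=1)}


def aggregate_ranks(rank_lists):
    """Order features by their arithmetic mean rank across all rank lists.

    Mean rank is an assumed aggregation mechanism chosen for illustration.
    """
    features = list(rank_lists[0])
    mean_rank = {f: mean(r[f] for r in rank_lists) for f in features}
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical relevance scores from three hypothetical filter methods.
filter_scores = {
    "filter_a": {"loc": 0.90, "cyclomatic": 0.70, "fan_out": 0.20, "comments": 0.10},
    "filter_b": {"loc": 0.60, "cyclomatic": 0.80, "fan_out": 0.30, "comments": 0.05},
    "filter_c": {"loc": 0.50, "cyclomatic": 0.40, "fan_out": 0.45, "comments": 0.10},
}

rank_lists = [scores_to_ranks(s) for s in filter_scores.values()]
aggregated = aggregate_ranks(rank_lists)
top_k = aggregated[:2]  # keep the two top-ranked features

print(aggregated)
print(top_k)
```

The individual filters disagree on the order of the features, and the aggregated list smooths out those disagreements before the top-ranked features are selected.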
