Cross-project software defect prediction improves the quality of new software projects or projects that lack historical data, and various data mining techniques have been recommended for it. Classification accuracy is one of the most significant problems in this field because historical data are scarce and heterogeneous. To address this challenge, this research uses the spotted hyena optimizer algorithm as a classifier to predict defects across projects. Confidence and support are used as a multi-objective fitness function to search for the best classification rules. These rules are then used to predict defects in new projects or in other projects with insufficient data. The NASA datasets JM1, KC1, and KC2 are used. Applying the spotted hyena optimizer algorithm as a classifier on one dataset and predicting defects in the other two yields reported accuracies of 84.6, 92.0, 82.4, 90.7, 86.6, and 81.8 for JM1, KC1, and KC2, respectively. These accuracy values are better than those of the most significant data mining techniques in the field, such as Support Vector Machine, Naïve Bayes, Boosting, C4.5, and Bagging. The research also discusses other performance measures such as precision, recall, and f-measure. It concludes that many McCabe and Halstead features have a strong impact on generating highly accurate defect predictors, such as McCabe's line count of code, McCabe's cyclomatic complexity, McCabe's essential complexity, McCabe's design complexity iv, Halstead's effort, Halstead's time estimator, Halstead's line count, Halstead's count of lines of comments, and total operators.
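To make the role of support and confidence concrete, the following is a minimal sketch of how a single classification rule could be scored. It is an illustration only, not the authors' implementation: the function names, the toy module metrics, the threshold in the rule, and the equal-weight combination of the two objectives are all assumptions, since the abstract does not say how the objectives are combined.

```python
from typing import Callable, Dict, List, Tuple

Module = Dict[str, float]  # one software module described by its metrics


def support_confidence(
    rows: List[Module],
    labels: List[bool],
    antecedent: Callable[[Module], bool],
) -> Tuple[float, float]:
    """Score the rule 'IF antecedent THEN defective'.

    support    = modules matching the rule AND defective / all modules
    confidence = modules matching the rule AND defective / modules matching the rule
    """
    covered = [lab for row, lab in zip(rows, labels) if antecedent(row)]
    hits = sum(1 for lab in covered if lab)
    support = hits / len(rows) if rows else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


def fitness(support: float, confidence: float,
            w_support: float = 0.5, w_confidence: float = 0.5) -> float:
    # Assumption: a weighted sum is just one simple way to fold two
    # objectives into a single score; the weights are hypothetical.
    return w_support * support + w_confidence * confidence


# Hypothetical toy data: lines of code and cyclomatic complexity per module.
rows = [
    {"loc": 120, "v_g": 15},
    {"loc": 30, "v_g": 3},
    {"loc": 200, "v_g": 22},
    {"loc": 90, "v_g": 12},
    {"loc": 25, "v_g": 2},
]
labels = [True, False, True, False, False]  # True means defective

# Hypothetical candidate rule: high cyclomatic complexity implies a defect.
support, confidence = support_confidence(rows, labels, lambda m: m["v_g"] > 10)
score = fitness(support, confidence)
```

In a search-based classifier, a score of this kind would be evaluated for each candidate rule so that the optimizer can keep the fittest ones.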
NoSQL databases are considered one of the most significant technologies in the current era of computer science, especially with the emergence of big data. NoSQL databases address the problem of processing and storing such data. To offer a reference for users of NoSQL databases, this survey examines the characteristics, categories, and theoretical basis of NoSQL, drawing on an account of the rise and development from relational databases to NoSQL and an examination of the limitations of relational databases in the current era. The survey also covers one type of NoSQL database, the graph database. Graph databases can be characterized as those in which the structures for instances and schema are modeled as graphs and data manipulation is expressed by graph-oriented operations and type constructors. They emerged in the eighties and nineties alongside object-oriented models. Their influence gradually faded with the rise of other models, specifically XML, spatial, and semi-structured models. Lately, the need to manipulate information with a graph-like nature has restored the relevance of this field. The fundamental goal of this review is to present the work that has been proposed in the field of graph databases.
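As a rough illustration of data modeled as a graph and queried with graph-oriented operations, the sketch below stores labeled edges between nodes and walks them breadth-first. It does not use or imitate the API of any particular graph database; the node names, edge labels, and helper functions are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Nodes carry properties; edges are (source, label, destination) triples.
nodes: Dict[str, Dict[str, str]] = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "acme": {"kind": "company"},
}
edges: List[Tuple[str, str, str]] = [
    ("alice", "knows", "bob"),
    ("bob", "works_at", "acme"),
]


def neighbors(node: str, label: Optional[str] = None) -> List[str]:
    """Follow outgoing edges from a node, optionally filtered by edge label."""
    return [dst for src, lab, dst in edges
            if src == node and (label is None or lab == label)]


def reachable(start: str) -> List[str]:
    """Breadth-first traversal: every node reachable from start, in visit order."""
    seen = {start}
    queue = deque([start])
    order: List[str] = []
    while queue:
        current = queue.popleft()
        for nxt in neighbors(current):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


employers_of_bob = neighbors("bob", "works_at")
reached_from_alice = reachable("alice")
```

The point of the sketch is that relationships are first-class data, so a query such as "what can be reached from this node" is a traversal rather than a chain of table joins.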
Software engineering companies strive to improve software quality by predicting defect-prone software modules. Although various data mining methods have been developed, unstable accuracy rates remain a critical issue owing to the imbalanced nature and high dimensionality of software defect datasets. To deal with this issue, we propose a spotted hyena algorithm, a novel meta-heuristic optimization algorithm, for predicting software defects. Support and confidence in classification rules are the basis of a multi-objective fitness function that assists the spotted hyena algorithm in serving as a classifier by finding the fittest classification or standard rules among individuals. Experiments were conducted on four NASA software datasets: JM1, KC2, KC1, and PC3. The spotted hyena classifier provides an accuracy of 85.2, 84, 89.6, and 81.8%, respectively, for these datasets. These accuracy rates are better than those achieved using other popular data mining techniques. We also discuss other classification measures, such as precision, recall, and confusion matrices, in connection with the experimental results. Moreover, the Gaussian mixture model is used to study the uncertainty quantification of the proposed classifier. The study demonstrated the feasible performance of the spotted hyena classifier in four different case studies.
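The classification measures named above all derive from a binary confusion matrix. The sketch below shows the standard definitions on made-up labels; the data and function names are hypothetical and do not reproduce any result reported in the abstract.

```python
from typing import Dict, List, Tuple


def confusion_counts(actual: List[int], predicted: List[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), where 1 means defective and 0 means clean."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard accuracy, precision, recall, and F-measure, guarding empty cases."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical labels for eight modules.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
result = metrics(*counts)
```

On imbalanced defect datasets, accuracy alone can look high even when few defective modules are found, which is why precision and recall are usually reported alongside it.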
In this article, affiliation no. 1 was incorrectly shown as the last author's first affiliation. The original article has been corrected. Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.