Method-Level Code Smells Detection Using Machine Learning Models

Dewangan, Seema; Rao, Rajwant Singh

doi:10.1007/978-981-99-3734-9_7

Lecture Notes in Networks and Systems

2023

Cited by 3 publications

References 13 publications

Exploring the role of project status information in effective code smell detection

Alkharabsheh, Alawadi, Crespo et al. 2024

Cluster Comput

Repairing code smells detected in the code or design of a system is one of the activities that contribute to increasing software quality. In this study, we investigate the impact of non-numerical software information, such as project status information, combined with machine learning techniques on improving code smell detection. For this purpose, we constructed a dataset consisting of 22 systems with various project statuses, 12,040 classes, and 18 features, including 1935 large classes. A set of experiments was conducted with ten different machine learning techniques by dividing the dataset into training, validation, and testing sets to detect the large class code smell. Feature selection and data balancing techniques were applied. Classifier performance was evaluated using six indicators: precision, recall, F-measure, MCC, ROC area, and Kappa. The preliminary experimental results reveal that feature selection and data balancing have little influence on the accuracy of machine learning classifiers. Moreover, their behavior varies when they are applied to sets with different values for the selected project status information of their classes. On average, classifiers perform better when given status information than without it. Random Forest performed best on all performance indicators (100%) with status information, while AdaBoostM1 and SMO performed worst on most of them (> 86%). According to the findings of this study, providing machine learning techniques with project status information about the classes to be analyzed can improve the results of large class detection.
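
As a rough illustration of the indicators named in the abstract above, the sketch below computes precision, recall, F-measure, MCC, and Cohen's Kappa from the four counts of a binary confusion matrix (smelly vs. not smelly). The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; ROC area is omitted because it needs ranked prediction scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp, fp, fn, tn):
    """Compute five indicators from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives for the "smelly" class. Illustrative helper only.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Precision: share of predicted-smelly items that really are smelly.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly smelly items that were found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure: harmonic mean of precision and recall.
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Cohen's Kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    kappa = (observed - expected) / (1 - expected) if expected != 1 else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC and Kappa are reported alongside precision and recall because they stay informative when the classes are imbalanced, as they are in a dataset where only a minority of classes are smelly.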

Ensemble methods with feature selection and data balancing for improved code smells classification performance

Yadav, Rao, Mishra et al. 2025

Engineering Applications of Artificial Intelligence

Machine Learning-Based Methods for Code Smell Detection: A Survey

Yadav, Rao, Mishra et al. 2024

Applied Sciences

Code smells are early warning signs of potential issues in software quality. Various techniques are used in code smell detection, including the Bayesian approach, rule-based automatic antipattern detection, antipattern identification utilizing B-splines, Support Vector Machine direct, SMURF (Support Vector Machines for design smell detection using relevant feedback), and immune-based detection strategy. Machine learning (ML) has made great strides in this area. This survey comprehensively covers relevant studies applying ML algorithms from 2005 to 2024 to provide insight into code smells, frequently applied ML algorithms, and software metrics. Forty-two pertinent studies allow us to assess the efficacy of ML algorithms on selected datasets. After evaluating various studies based on open-source and project datasets, this study evaluated additional threats and obstacles to code smell detection, such as the lack of standardized code smell definitions, the difficulty of feature selection, and the challenges of handling large-scale datasets. Existing studies considered only a few factors in identifying code smells, whereas this study includes several potential contributing factors. Several ML algorithms are examined, and various approaches, datasets, dataset languages, and software metrics are presented. This study highlights the potential of ML algorithms to produce better results and fills a gap in the body of knowledge by providing class-wise distributions of the ML algorithms. Support Vector Machine, J48, Naive Bayes, and Random Forest models are the most common for detecting code smells. Researchers can find this study helpful in better anticipating and addressing software design and implementation issues. The findings from this study, which highlight the practical implications of ML algorithms in software quality improvement, will help software engineers fix problems during software design and development to ensure software quality.
