Code Smell Detection Using Ensemble Machine Learning Algorithms

Code smells result from not following software engineering principles during software development, especially in the design and coding phases. They lead to low maintainability. Code smell detection can help to evaluate the quality of software and its maintainability. Many machine learning...

Bibliographic Details
Main Authors:	Seema Dewangan, Rajwant Singh Rao, Alok Mishra, Manjari Gupta
Format:	Article
Language:	English
Published:	MDPI AG 2022-10-01
Series:	Applied Sciences
Subjects:	code smell, code smell detection, ensemble method, deep learning, Chi-square feature extraction technique, SMOTE class balancing technique
Online Access:	https://www.mdpi.com/2076-3417/12/20/10321

author	Seema Dewangan, Rajwant Singh Rao, Alok Mishra, Manjari Gupta
author_sort	Seema Dewangan
collection	DOAJ
description	Code smells result from not following software engineering principles during software development, especially in the design and coding phases. They lead to low maintainability. Code smell detection can help to evaluate the quality of software and its maintainability. Many machine learning algorithms are being used to detect code smells. In this study, we applied five ensemble machine learning and two deep learning algorithms to detect code smells. Four code smell datasets were analyzed: the Data class, the God class, the Feature-envy, and the Long-method datasets. In previous works, machine learning and stacking ensemble learning algorithms were applied to these datasets and the results were acceptable, but there is scope for improvement. A class balancing technique (SMOTE) was applied to handle the class imbalance problem in the datasets. The Chi-square feature extraction technique was applied to select the more relevant features in each dataset. All five algorithms obtained the highest accuracy, 100%, for the Long-method dataset with the different selected sets of metrics, and the poorest accuracy, 91.45%, was achieved by the Max voting method for the Feature-envy dataset for the selected twelve sets of metrics.
first_indexed	2024-03-09T20:48:01Z
format	Article
id	doaj.art-ceadbaab626148909d67c379681a3ee8
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-09T20:48:01Z
publishDate	2022-10-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-ceadbaab626148909d67c379681a3ee8; 2023-11-23T22:42:44Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2022-10-01; vol. 12, no. 20, art. 10321; doi:10.3390/app122010321; Code Smell Detection Using Ensemble Machine Learning Algorithms; Seema Dewangan (Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur 495009, India); Rajwant Singh Rao (Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur 495009, India); Alok Mishra (Informatics and Digitalization Group, Faculty of Logistics, Molde University College—Specialized University in Logistics, 6410 Molde, Norway); Manjari Gupta (Computer Science, DST—Centre for Interdisciplinary Mathematical Sciences, Institute of Science, Banaras Hindu University, Varanasi 221005, India); Code smells result from not following software engineering principles during software development, especially in the design and coding phases. They lead to low maintainability. Code smell detection can help to evaluate the quality of software and its maintainability. Many machine learning algorithms are being used to detect code smells. In this study, we applied five ensemble machine learning and two deep learning algorithms to detect code smells. Four code smell datasets were analyzed: the Data class, the God class, the Feature-envy, and the Long-method datasets. In previous works, machine learning and stacking ensemble learning algorithms were applied to these datasets and the results were acceptable, but there is scope for improvement. A class balancing technique (SMOTE) was applied to handle the class imbalance problem in the datasets. The Chi-square feature extraction technique was applied to select the more relevant features in each dataset. All five algorithms obtained the highest accuracy, 100%, for the Long-method dataset with the different selected sets of metrics, and the poorest accuracy, 91.45%, was achieved by the Max voting method for the Feature-envy dataset for the selected twelve sets of metrics.; https://www.mdpi.com/2076-3417/12/20/10321; code smell, code smell detection, ensemble method, deep learning, Chi-square feature extraction technique, SMOTE class balancing technique
title	Code Smell Detection Using Ensemble Machine Learning Algorithms
topic	code smell, code smell detection, ensemble method, deep learning, Chi-square feature extraction technique, SMOTE class balancing technique
url	https://www.mdpi.com/2076-3417/12/20/10321
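
The description names Max voting as one of the ensemble methods. As a rough illustration of that idea only, the sketch below combines the label predictions of several classifiers by majority (hard) voting. It is not the authors' implementation: the function names max_vote and accuracy, the tie-breaking rule, and the sample predictions are all assumptions made for this example, and the numbers are not taken from the article.

```python
from collections import Counter
from typing import List, Sequence


def max_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier label predictions by majority (hard) voting.

    predictions[i][j] is the label that classifier i assigns to sample j.
    Ties go to the smallest label; this rule is an assumption of the sketch.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every classifier must predict every sample")

    combined = []
    for j in range(n_samples):
        votes = Counter(p[j] for p in predictions)
        top = max(votes.values())
        # Keep the most-voted label, breaking ties by the smallest label.
        combined.append(min(lbl for lbl, cnt in votes.items() if cnt == top))
    return combined


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical predictions from three base classifiers on six samples
# (1 = smelly, 0 = not smelly). Illustrative values, not data from the article.
clf_a = [1, 0, 1, 0, 0, 0]
clf_b = [1, 1, 0, 1, 0, 0]
clf_c = [0, 0, 1, 1, 0, 1]
y_true = [1, 0, 1, 1, 0, 0]

y_vote = max_vote([clf_a, clf_b, clf_c])
print(y_vote)
print(accuracy(y_true, y_vote))
```

In this made-up example each base classifier makes at least one mistake, but the mistakes fall on different samples, so the majority vote recovers every true label. That is the intuition behind voting ensembles, not a claim about the accuracies reported in the record.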

Similar Items