Comparing Commit Messages and Source Code Metrics for the Prediction Refactoring Activities

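Understanding how developers refactor their code is critical to supporting the design improvement process of software. This paper investigates to what extent code metrics are good indicators for predicting refactoring activity in the source code. To this end, we formulated the prediction of refactoring operation types as a multi-class classification problem. Our solution measures metrics on committed code changes and derives from them the features (i.e., metric variations) that best represent each class (i.e., refactoring type), in order to automatically predict, for a given commit, the method-level type of refactoring being applied, namely Move Method, Rename Method, Extract Method, Inline Method, Pull-up Method, and Push-down Method. We compared various classifiers, in terms of their prediction performance, using a dataset of 5004 commits extracted from 800 Java projects. Our main findings show that the random forest model trained with code metrics achieved the best average accuracy, 75%. However, the results vary per class, which means that some refactoring types are harder to detect than others.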
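The approach in the description (metric variations per commit as features, refactoring type as the class) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the metric names, the toy training data, and the classifier itself. The paper reports a random forest as the best model; the sketch swaps in a much simpler nearest-centroid rule so that it runs with the Python standard library alone.

```python
from collections import defaultdict
from math import dist

# Hypothetical metric names: the record does not say which metrics the paper uses.
METRICS = ("lines_of_code", "cyclomatic_complexity", "method_count")


def metric_variation(before, after):
    """Feature vector for one commit: each metric's change (after - before)."""
    return [after[name] - before[name] for name in METRICS]


def fit_centroids(samples):
    """Average the feature vectors of each refactoring type.

    This is a nearest-centroid stand-in, not the random forest from the paper.
    """
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the refactoring type whose centroid is closest to the features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Toy, made-up training data: (metric variations, refactoring type).
TRAINING = [
    ([4, 0, 1], "Extract Method"),
    ([6, 0, 1], "Extract Method"),
    ([-4, 0, -1], "Inline Method"),
    ([-6, 0, -1], "Inline Method"),
    ([0, 0, 0], "Rename Method"),
    ([0, 0, 0], "Rename Method"),
]

centroids = fit_centroids(TRAINING)

# Metrics of a hypothetical class before and after one commit.
commit = metric_variation(
    {"lines_of_code": 100, "cyclomatic_complexity": 10, "method_count": 5},
    {"lines_of_code": 103, "cyclomatic_complexity": 10, "method_count": 6},
)
print(predict(centroids, commit))
```

With real data, the same feature vectors could be passed to any multi-class classifier; per-class accuracy would then show which refactoring types are harder to tell apart, as the description notes.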

Bibliographic Details
Main Authors:	Priyadarshni Suresh Sagar, Eman Abdulah AlOmar, Mohamed Wiem Mkaouer, Ali Ouni, Christian D. Newman
Format:	Article
Language:	English
Published:	MDPI AG 2021-09-01
Series:	Algorithms
Subjects:	refactoring; software quality; commits; software metrics; software engineering
Online Access:	https://www.mdpi.com/1999-4893/14/10/289

_version_	1797515567491448832
author	Priyadarshni Suresh Sagar; Eman Abdulah AlOmar; Mohamed Wiem Mkaouer; Ali Ouni; Christian D. Newman
collection	DOAJ
description	Understanding how developers refactor their code is critical to supporting the design improvement process of software. This paper investigates to what extent code metrics are good indicators for predicting refactoring activity in the source code. To this end, we formulated the prediction of refactoring operation types as a multi-class classification problem. Our solution measures metrics on committed code changes and derives from them the features (i.e., metric variations) that best represent each class (i.e., refactoring type), in order to automatically predict, for a given commit, the method-level type of refactoring being applied, namely Move Method, Rename Method, Extract Method, Inline Method, Pull-up Method, and Push-down Method. We compared various classifiers, in terms of their prediction performance, using a dataset of 5004 commits extracted from 800 Java projects. Our main findings show that the random forest model trained with code metrics achieved the best average accuracy, 75%. However, the results vary per class, which means that some refactoring types are harder to detect than others.
first_indexed	2024-03-10T06:47:14Z
format	Article
id	doaj.art-5dc933a72ecd424f8d1bb0abe2db482d
institution	Directory Open Access Journal
issn	1999-4893
language	English
last_indexed	2024-03-10T06:47:14Z
publishDate	2021-09-01
publisher	MDPI AG
record_format	Article
series	Algorithms
spelling	doaj.art-5dc933a72ecd424f8d1bb0abe2db482d | 2023-11-22T17:08:24Z | eng | MDPI AG | Algorithms | 1999-4893 | 2021-09-01 | vol. 14, no. 10, art. 289 | 10.3390/a14100289 | Comparing Commit Messages and Source Code Metrics for the Prediction Refactoring Activities | Priyadarshni Suresh Sagar (Rochester Institute of Technology, Rochester, New York, NY 14623, USA); Eman Abdulah AlOmar (Rochester Institute of Technology, Rochester, New York, NY 14623, USA); Mohamed Wiem Mkaouer (Rochester Institute of Technology, Rochester, New York, NY 14623, USA); Ali Ouni (Ecole de Technologie Superieure, University of Quebec, Quebec City, QC H3C 1K3, Canada); Christian D. Newman (Rochester Institute of Technology, Rochester, New York, NY 14623, USA) | https://www.mdpi.com/1999-4893/14/10/289 | refactoring; software quality; commits; software metrics; software engineering
title	Comparing Commit Messages and Source Code Metrics for the Prediction Refactoring Activities
topic	refactoring; software quality; commits; software metrics; software engineering
url	https://www.mdpi.com/1999-4893/14/10/289

Similar Items