FedCSD: A Federated Learning Based Approach for Code-Smell Detection

Software quality is critical, as low quality, or “Code smell,” increases technical debt and maintenance costs. There is a timely need for a collaborative model that detects and manages code smells by learning from diverse and distributed data sources while respecting privacy an...

Bibliographic Details
Main Authors: Sadi Alawadi, Khalid Alkharabsheh, Fahed Alkhabbas, Victor R. Kebande, Feras M. Awaysheh, Fabio Palomba, Mohammed Awad
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Software quality; technical debt; federated learning; privacy-preserving; code smell detection
Online Access: https://ieeexplore.ieee.org/document/10477413/
author Sadi Alawadi
Khalid Alkharabsheh
Fahed Alkhabbas
Victor R. Kebande
Feras M. Awaysheh
Fabio Palomba
Mohammed Awad
author_sort Sadi Alawadi
collection DOAJ
description Software quality is critical, as low quality, or “Code smell,” increases technical debt and maintenance costs. There is a timely need for a collaborative model that detects and manages code smells by learning from diverse and distributed data sources while respecting privacy and providing a scalable solution for continuously integrating new patterns and practices in code quality management. However, the current literature still lacks such capabilities. This paper addresses these challenges by proposing a Federated Learning Code Smell Detection (FedCSD) approach, specifically targeting the “God Class” smell, that enables organizations to collaboratively train distributed ML models while safeguarding data privacy. To validate the approach, we conduct experiments on manually validated datasets to detect and analyze code smell scenarios. Experiment 1, a centralized training experiment, revealed varying accuracies across datasets, with dataset two achieving the lowest accuracy (92.30%) and datasets one and three achieving the highest (98.90% and 99.5%, respectively). Experiment 2, focusing on cross-evaluation, showed a significant drop in accuracy (lowest: 63.80%) when fewer smells were present in the training dataset, reflecting technical debt. Experiment 3 involved splitting the dataset across 10 companies, resulting in a global model accuracy of 98.34%, comparable to the centralized model’s highest accuracy. The application of federated ML techniques demonstrates promising performance improvements in code-smell detection, benefiting both software developers and researchers.
first_indexed 2024-04-24T15:40:24Z
format Article
id doaj.art-25e1d0a0fe8546fbb329d4970f0edef5
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-24T15:40:24Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-25e1d0a0fe8546fbb329d4970f0edef5
Record timestamp: 2024-04-01T23:00:32Z
Language: eng
Publisher: IEEE
Journal: IEEE Access
ISSN: 2169-3536
Publication date: 2024-01-01
Volume: 12
Pages: 44888-44904
DOI: 10.1109/ACCESS.2024.3380167
IEEE document number: 10477413
Title: FedCSD: A Federated Learning Based Approach for Code-Smell Detection
Authors and affiliations:
Sadi Alawadi (https://orcid.org/0000-0002-6309-2892), Department of Computer Science, Blekinge Institute of Technology, Karlskrona, Sweden
Khalid Alkharabsheh (https://orcid.org/0000-0002-3182-418X), Software Engineering Department, Prince Abdullah bin Ghazi Faculty of Information and Communication Technology, Al-Balqa Applied University, As-Salt, Jordan
Fahed Alkhabbas (https://orcid.org/0000-0002-8025-4734), Internet of Things and People Research Center, Malmö University, Malmö, Sweden
Victor R. Kebande (https://orcid.org/0000-0003-4071-4596), Department of Computer Science, Blekinge Institute of Technology, Karlskrona, Sweden
Feras M. Awaysheh, Institute of Computer Science, Delta Research Centre, University of Tartu, Tartu, Estonia
Fabio Palomba (https://orcid.org/0000-0001-9337-5116), Department of Computer Science, University of Salerno, Fisciano, Italy
Mohammed Awad (https://orcid.org/0000-0002-5053-0785), Department of Computer Systems Engineering, Arab American University, Jenin, Palestine
URL: https://ieeexplore.ieee.org/document/10477413/
Keywords: Software quality; technical debt; federated learning; privacy-preserving; code smell detection
title FedCSD: A Federated Learning Based Approach for Code-Smell Detection
topic Software quality
technical debt
federated learning
privacy-preserving
code smell detection
url https://ieeexplore.ieee.org/document/10477413/