Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance
Software defect prediction (SDP) is a technique for predicting the occurrence of defects in the early stages of the software development process. Early prediction of defects reduces the overall cost of software and increases its reliability. Most of the defect prediction methods proposed in...
Main Authors: | Kiran Kumar Bejjanki, Jayadev Gyani, Narsimha Gugulothu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-03-01 |
Series: | Symmetry |
Subjects: | software defect prediction; oversampling; class imbalance; classification; software metrics |
Online Access: | https://www.mdpi.com/2073-8994/12/3/407 |
_version_ | 1811187268760633344 |
---|---|
author | Kiran Kumar Bejjanki; Jayadev Gyani; Narsimha Gugulothu |
author_facet | Kiran Kumar Bejjanki; Jayadev Gyani; Narsimha Gugulothu |
author_sort | Kiran Kumar Bejjanki |
collection | DOAJ |
description | Software defect prediction (SDP) is a technique for predicting the occurrence of defects in the early stages of the software development process. Early prediction of defects reduces the overall cost of software and increases its reliability. Most of the defect prediction methods proposed in the literature suffer from the class imbalance problem. In this paper, a novel class imbalance reduction (CIR) algorithm is proposed that creates symmetry between the defect and non-defect records in imbalanced datasets by considering the distribution properties of the datasets. CIR is compared with SMOTE (synthetic minority oversampling technique), a built-in package of many machine learning tools that is considered a benchmark for handling class imbalance problems, and with K-Means SMOTE. We conducted the experiment on forty open source software defect datasets from the PRedictOr Models In Software Engineering (PROMISE) repository using eight different classifiers and evaluated the results with six performance measures. The results show that the proposed CIR method performs better than SMOTE and K-Means SMOTE. |
first_indexed | 2024-04-11T13:59:24Z |
format | Article |
id | doaj.art-7dc5c995ee694c90921c86ace8ad1c0d |
institution | Directory of Open Access Journals |
issn | 2073-8994 |
language | English |
last_indexed | 2024-04-11T13:59:24Z |
publishDate | 2020-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Symmetry |
spelling | doaj.art-7dc5c995ee694c90921c86ace8ad1c0d; 2022-12-22T04:20:10Z; eng; MDPI AG; Symmetry; 2073-8994; 2020-03-01; vol. 12, no. 3, art. 407; 10.3390/sym12030407; sym12030407; Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance; Kiran Kumar Bejjanki (0), Jayadev Gyani (1), Narsimha Gugulothu (2); (0) Department of Information Technology, Kakatiya Institute of Technology & Science, Warangal 506015, India; (1) Department of Computer Science, College of Computer and Information Sciences, Majmaah University, Al Majmaah 11952, Saudi Arabia; (2) Department of CSE, JNTUH College of Engineering, Hyderabad 500085, India; https://www.mdpi.com/2073-8994/12/3/407; software defect prediction; oversampling; class imbalance; classification; software metrics |
title | Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance |
title_sort | class imbalance reduction cir a novel approach to software defect prediction in the presence of class imbalance |
topic | software defect prediction; oversampling; class imbalance; classification; software metrics |
url | https://www.mdpi.com/2073-8994/12/3/407 |
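
The description names SMOTE as the baseline that CIR is compared against. This record does not give the CIR algorithm itself, so the sketch below illustrates only the general SMOTE idea: each synthetic minority (defect) record is an interpolation between a real minority record and one of its nearest minority neighbours. The function names `smote_sketch` and `balance`, the parameter defaults, and the two-metric toy data are all hypothetical and are not taken from the paper.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between minority neighbours.

    Illustrative only: this is a simplified SMOTE-style procedure,
    not the CIR algorithm described in the record above.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(defective, clean, k=5, seed=0):
    """Oversample the defective class until it matches the clean class in size."""
    deficit = max(0, len(clean) - len(defective))
    return list(defective) + smote_sketch(defective, deficit, k=k, seed=seed)


# Hypothetical toy data: each row is [metric_a, metric_b] for one module.
defective = [[10.0, 2.0], [12.0, 3.0], [11.0, 2.5]]
clean = [[1.0, 0.5], [2.0, 0.4], [1.5, 0.6], [3.0, 0.2],
         [2.5, 0.3], [1.2, 0.7], [2.2, 0.1], [1.8, 0.9]]

balanced = balance(defective, clean)
print(len(defective), "->", len(balanced))
```

Because every synthetic row lies on a line segment between two real defect records, the oversampled class stays inside the region already occupied by the minority data. In practice, resampling of this kind is normally applied to the training split only, so that evaluation data contains no synthetic records.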