Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance

Software defect prediction (SDP) is a technique for predicting the occurrence of defects in the early stages of the software development process. Early prediction of defects reduces the overall cost of software and increases its reliability. Most of the defect prediction methods proposed in...

Bibliographic Details
Main Authors: Kiran Kumar Bejjanki, Jayadev Gyani, Narsimha Gugulothu
Format: Article
Language: English
Published: MDPI AG 2020-03-01
Series: Symmetry
Subjects: software defect prediction; oversampling; class imbalance; classification; software metrics
Online Access: https://www.mdpi.com/2073-8994/12/3/407
author Kiran Kumar Bejjanki
Jayadev Gyani
Narsimha Gugulothu
collection DOAJ
description Software defect prediction (SDP) is a technique for predicting the occurrence of defects in the early stages of the software development process. Early prediction of defects reduces the overall cost of software and increases its reliability. Most of the defect prediction methods proposed in the literature suffer from the class imbalance problem. In this paper, a novel class imbalance reduction (CIR) algorithm is proposed that creates symmetry between the defect and non-defect records in imbalanced datasets by considering the distribution properties of the datasets. CIR is compared with SMOTE (synthetic minority oversampling technique), which is available as a built-in package in many machine learning tools and is considered a benchmark for handling class imbalance, and with K-Means SMOTE. We conducted experiments on forty open source software defect datasets from the PRedictOr Models In Software Engineering (PROMISE) repository using eight different classifiers and evaluated the results with six performance measures. The results show that the proposed CIR method performs better than SMOTE and K-Means SMOTE.
first_indexed 2024-04-11T13:59:24Z
format Article
id doaj.art-7dc5c995ee694c90921c86ace8ad1c0d
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-04-11T13:59:24Z
publishDate 2020-03-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-7dc5c995ee694c90921c86ace8ad1c0d, 2022-12-22T04:20:10Z, eng
MDPI AG, Symmetry, ISSN 2073-8994, 2020-03-01, vol. 12, no. 3, article 407
DOI 10.3390/sym12030407 (sym12030407)
Kiran Kumar Bejjanki (0): Department of Information Technology, Kakatiya Institute of Technology & Science, Warangal 506015, India
Jayadev Gyani (1): Department of Computer Science, College of Computer and Information Sciences, Majmaah University, Al Majmaah 11952, Saudi Arabia
Narsimha Gugulothu (2): Department of CSE, JNTUH College of Engineering, Hyderabad 500085, India
https://www.mdpi.com/2073-8994/12/3/407
software defect prediction; oversampling; class imbalance; classification; software metrics
title Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance
topic software defect prediction
oversampling
class imbalance
classification
software metrics
url https://www.mdpi.com/2073-8994/12/3/407