Software Defect Prediction Using Wrapper Feature Selection Based on Dynamic Re-Ranking Strategy

Bibliographic Details
Main Authors: Abdullateef Oluwagbemiga Balogun, Shuib Basri, Luiz Fernando Capretz, Saipunidzam Mahamad, Abdullahi Abubakar Imam, Malek A. Almomani, Victor Elijah Adeyemo, Ammar K. Alazzawi, Amos Orenyi Bajeh, Ganesh Kumar
Format: Article
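Language: English
Published: MDPI AG 2021-11-01
Series: Symmetry
Subjects: high dimensionality; re-ranking strategy; software defect prediction; wrapper feature method; Cuckoo search method
Online Access: https://www.mdpi.com/2073-8994/13/11/2166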
_version_ 1797508412477538304
author Abdullateef Oluwagbemiga Balogun
Shuib Basri
Luiz Fernando Capretz
Saipunidzam Mahamad
Abdullahi Abubakar Imam
Malek A. Almomani
Victor Elijah Adeyemo
Ammar K. Alazzawi
Amos Orenyi Bajeh
Ganesh Kumar
collection DOAJ
description Finding defects early in a software system is a crucial task, as it leaves adequate time to fix them with the available resources. Strategies such as symmetric testing have proven useful; however, their inability to differentiate incorrect implementations from correct ones is a drawback. Software defect prediction (SDP) is another feasible method for detecting defects early. However, high dimensionality, a data quality problem, has a detrimental effect on the predictive capability of SDP models. Feature selection (FS) has been used to address the high dimensionality issue in SDP. According to the current literature, the two basic forms of FS are filter-based feature selection (FFS) and wrapper-based feature selection (WFS). Of the two, WFS approaches have been deemed superior. However, WFS methods have a high computational cost because the number of executions required for feature subset search, evaluation, and selection is unknown. This characteristic often leads to overfitting of classifier models, because the search is easily trapped in local maxima. The trapping of the WFS subset evaluator in local maxima can be overcome by using an effective search method in the evaluation process. Hence, this study proposes an enhanced WFS method that dynamically and iteratively selects features. The proposed enhanced WFS (EWFS) method incrementally selects features while considering previously selected features in its search space. The novelty of EWFS lies in enhancing the subset evaluation process of WFS methods with a dynamic re-ranking strategy that iteratively selects germane features in few subset evaluation cycles without compromising the prediction performance of the resulting model. For evaluation, EWFS was deployed with Decision Tree (DT) and Naïve Bayes classifiers on software defect datasets of varying granularity. The experimental findings revealed that EWFS outperformed the existing metaheuristic and sequential search-based WFS approaches established in this work. Additionally, EWFS selected fewer features in less computational time than the existing metaheuristic and sequential search-based WFS methods.
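The incremental selection with re-ranking described above can be illustrated with a minimal sketch. This is not the authors' EWFS implementation: the function names (`nearest_centroid_accuracy`, `rerank_select`, `make_demo_data`), the `block_size` parameter, the nearest-centroid classifier used in place of DT or Naïve Bayes, and the synthetic data are all assumptions made for illustration. The sketch ranks the remaining features by how well each performs together with the features already selected, tries only the top-ranked block, and stops once a block adds nothing.

```python
import random


def nearest_centroid_accuracy(X, y, features, folds=4):
    """Wrapper evaluator (assumed stand-in for DT or Naive Bayes):
    k-fold accuracy of a nearest-centroid classifier that only sees
    the given feature indices."""
    if not features:
        return 0.0
    correct = 0
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        test = [i for i in range(len(X)) if i % folds == k]
        # One centroid per class, computed on the training fold only.
        centroids = {}
        for label in sorted(set(y[i] for i in train)):
            rows = [X[i] for i in train if y[i] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
        for i in test:
            point = [X[i][f] for f in features]
            predicted = min(
                centroids,
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            correct += predicted == y[i]
    return correct / len(X)


def rerank_select(X, y, block_size=2, evaluate=nearest_centroid_accuracy):
    """Incremental wrapper selection with re-ranking (illustrative only)."""
    selected, best, evaluations = [], 0.0, 0
    remaining = list(range(len(X[0])))
    while remaining:
        # Re-rank: score each candidate jointly with the selected features.
        scored = []
        for f in remaining:
            scored.append((evaluate(X, y, selected + [f]), f))
            evaluations += 1
        scored.sort(key=lambda t: (-t[0], t[1]))
        improved = False
        # Try only the top block; keep a feature if it improves accuracy.
        for _, f in scored[:block_size]:
            trial = evaluate(X, y, selected + [f])
            evaluations += 1
            if trial > best:
                selected.append(f)
                best = trial
                improved = True
        # The whole block is consumed, whether accepted or rejected.
        remaining = [f for _, f in scored[block_size:]]
        if not improved:
            break
    return selected, best, evaluations


def make_demo_data(n_samples=40, seed=0):
    """Synthetic data (hypothetical): only feature 2 separates the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = (i // 4) % 2
        row = [rng.uniform(-1, 1) for _ in range(4)]
        row[2] += 10 * label
        X.append(row)
        y.append(label)
    return X, y


X_demo, y_demo = make_demo_data()
selected, accuracy, evaluations = rerank_select(X_demo, y_demo)
print(selected, accuracy, evaluations)
```

Because candidates are always scored together with the features already chosen, a feature that is redundant given the current subset falls in the ranking, and the stopping rule bounds the number of subset evaluations.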
first_indexed 2024-03-10T05:01:44Z
format Article
id doaj.art-ff406c16769e473cba37a8ef07580c9b
institution Directory Open Access Journal
issn 2073-8994
language English
last_indexed 2024-03-10T05:01:44Z
publishDate 2021-11-01
publisher MDPI AG
record_format Article
series Symmetry
spelling doaj.art-ff406c16769e473cba37a8ef07580c9b
2023-11-23T01:46:11Z
eng
MDPI AG
Symmetry
2073-8994
2021-11-01
Volume 13, Issue 11, Article 2166
doi:10.3390/sym13112166
Software Defect Prediction Using Wrapper Feature Selection Based on Dynamic Re-Ranking Strategy
Abdullateef Oluwagbemiga Balogun (Department of Computer and Information Science, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia)
Shuib Basri (Department of Computer and Information Science, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia)
Luiz Fernando Capretz (Department of Electrical and Computer Engineering, Western University, London, ON N6A 5B9, Canada)
Saipunidzam Mahamad (Department of Computer and Information Science, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia)
Abdullahi Abubakar Imam (Department of Computer and Information Science, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia)
Malek A. Almomani (Department of Software Engineering, The World Islamic Sciences and Education University, Amman 11947, Jordan)
Victor Elijah Adeyemo (School of Built Environment, Engineering and Computing, Headingley Campus, Leeds Beckett University, Leeds LS6 3QS, UK)
Ammar K. Alazzawi (Department of Computer and Information Science, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia)
Amos Orenyi Bajeh (Department of Computer Science, University of Ilorin, Ilorin 1515, Nigeria)
Ganesh Kumar (Department of Computer and Information Science, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia)
https://www.mdpi.com/2073-8994/13/11/2166
high dimensionality
re-ranking strategy
software defect prediction
wrapper feature method
Cuckoo search method
title Software Defect Prediction Using Wrapper Feature Selection Based on Dynamic Re-Ranking Strategy
topic high dimensionality
re-ranking strategy
software defect prediction
wrapper feature method
Cuckoo search method
url https://www.mdpi.com/2073-8994/13/11/2166