An Improved CNN Model for Within-Project Software Defect Prediction

To improve software reliability, software defect prediction is used to find software bugs and prioritize testing efforts. Recently, some researchers introduced deep learning models, such as the deep belief network (DBN) and the state-of-the-art convolutional neural network (CNN), and used automatica...

Bibliographic Details
Main Authors:	Cong Pan, Minyan Lu, Biao Xu, Houleng Gao
Format:	Article
Language:	English
Published:	MDPI AG 2019-05-01
Series:	Applied Sciences
Subjects:	CNN model, within-project defect prediction, abstract syntax tree, deep learning, hyperparameter instability
Online Access:	https://www.mdpi.com/2076-3417/9/10/2138

author	Cong Pan, Minyan Lu, Biao Xu, Houleng Gao
collection	DOAJ
description	To improve software reliability, software defect prediction is used to find software bugs and prioritize testing efforts. Recently, some researchers introduced deep learning models, such as the deep belief network (DBN) and the state-of-the-art convolutional neural network (CNN), and used automatically generated features extracted from abstract syntax trees (ASTs) and deep learning models to improve defect prediction performance. However, the research on the CNN model failed to reveal clear conclusions due to its limited dataset size, insufficiently repeated experiments, and outdated baseline selection. To solve these problems, we built the PROMISE Source Code (PSC) dataset to enlarge the original dataset in the CNN research, which we named the Simplified PROMISE Source Code (SPSC) dataset. Then, we proposed an improved CNN model for within-project defect prediction (WPDP) and compared our results to existing CNN results and an empirical study. Our experiment was based on a 30-repetition holdout validation and a 10 * 10 cross-validation. Experimental results showed that our improved CNN model was comparable to the existing CNN model, and it outperformed the state-of-the-art machine learning models significantly for WPDP. Furthermore, we defined hyperparameter instability and examined the threat and opportunity it presents for deep learning models on defect prediction.
first_indexed	2024-04-14T01:03:32Z
format	Article
id	doaj.art-9e5cb88b7b37415890b379171c59d9f2
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-04-14T01:03:32Z
publishDate	2019-05-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-9e5cb88b7b37415890b379171c59d9f2; 2022-12-22T02:21:19Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2019-05-01; volume 9, issue 10, article 2138; DOI 10.3390/app9102138; article ID app9102138; Cong Pan, Minyan Lu, Biao Xu, Houleng Gao (all authors: The Key Laboratory on Reliability and Environmental Engineering Technology, Beihang University, Beijing 100191, China)
title	An Improved CNN Model for Within-Project Software Defect Prediction
topic	CNN model, within-project defect prediction, abstract syntax tree, deep learning, hyperparameter instability
url	https://www.mdpi.com/2076-3417/9/10/2138

Similar Items