Heterogeneous cross-project defect prediction with multiple source projects based on transfer learning

Cross-project defect prediction (CPDP) aims to predict the defect proneness of a target project using the defect data of a source project. Existing CPDP methods assume that the source and target projects share the same metrics. Heterogeneous cross-project defect prediction (HCPDP)...

Bibliographic Details
Main Authors: Xinglong Yin, Lei Liu, Huaxiao Liu, Qi Wu
Format: Article
Language: English
Published: AIMS Press 2020-01-01
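Series: Mathematical Biosciences and Engineering
Subjects: defect prediction; heterogeneous metrics; multiple heterogeneous source projects; transfer learning
Online Access: https://www.aimspress.com/article/doi/10.3934/mbe.2020054?viewType=HTML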
author Xinglong Yin
Lei Liu
Huaxiao Liu
Qi Wu
collection DOAJ
description Cross-project defect prediction (CPDP) aims to predict the defect proneness of a target project using the defect data of a source project. Existing CPDP methods assume that the source and target projects share the same metrics. Heterogeneous cross-project defect prediction (HCPDP) builds a prediction model from source and target projects with heterogeneous metrics. Existing HCPDP methods handle only one source project, or multiple source projects with the same metrics, which limits the pool of usable source projects. In this paper, we propose Heterogeneous Defect Prediction with Multiple source projects (HDPM), which can use multiple heterogeneous source projects for defect prediction. HDPM is based on transfer learning, which learns knowledge from one domain and uses it to help in another domain. HDPM constructs a projective matrix between the heterogeneous source and target projects to make their distributions similar. We conduct experiments on 14 projects from four public datasets, and the results show that HDPM achieves better performance than existing CPDP methods and outperforms or is comparable to the within-project defect prediction method. Using multiple heterogeneous source projects can effectively extend the data acquisition range of defect prediction and make software defect prediction more applicable to software engineering.
first_indexed 2024-12-16T15:02:23Z
format Article
id doaj.art-89a025c999264a2fa3c15a826e0a6162
institution Directory Open Access Journal
issn 1551-0018
language English
last_indexed 2024-12-16T15:02:23Z
publishDate 2020-01-01
publisher AIMS Press
record_format Article
series Mathematical Biosciences and Engineering
spelling doaj.art-89a025c999264a2fa3c15a826e0a6162
2022-12-21T22:27:15Z
eng
AIMS Press
Mathematical Biosciences and Engineering
1551-0018
2020-01-01
Volume 17, Issue 2, Pages 1020-1040
10.3934/mbe.2020054
Heterogeneous cross-project defect prediction with multiple source projects based on transfer learning
Xinglong Yin (0), Lei Liu (1), Huaxiao Liu (2), Qi Wu (3)
Affiliation (all four authors): Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
https://www.aimspress.com/article/doi/10.3934/mbe.2020054?viewType=HTML
defect prediction; heterogeneous metrics; multiple heterogeneous source projects; transfer learning
title Heterogeneous cross-project defect prediction with multiple source projects based on transfer learning
topic defect prediction
heterogeneous metrics
multiple heterogeneous source projects
transfer learning
url https://www.aimspress.com/article/doi/10.3934/mbe.2020054?viewType=HTML