Heterogeneous cross-project defect prediction with multiple source projects based on transfer learning
Cross-project defect prediction (CPDP) aims to predict the defect proneness of a target project using defect data from a source project. Existing CPDP methods assume that the source and target projects share the same metrics. Heterogeneous cross-project defect prediction (HCPDP)...
Main Authors: | Xinglong Yin, Lei Liu, Huaxiao Liu, Qi Wu |
---|---|
Format: | Article |
Language: | English |
Published: | AIMS Press, 2020-01-01 |
Series: | Mathematical Biosciences and Engineering |
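Subjects: | defect prediction, heterogeneous metrics, multiple heterogeneous source projects, transfer learning |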
Online Access: | https://www.aimspress.com/article/doi/10.3934/mbe.2020054?viewType=HTML |
author | Xinglong Yin Lei Liu Huaxiao Liu Qi Wu |
collection | DOAJ |
description | Cross-project defect prediction (CPDP) aims to predict the defect proneness of a target project using defect data from a source project. Existing CPDP methods assume that the source and target projects share the same metrics. Heterogeneous cross-project defect prediction (HCPDP) builds a prediction model from source and target projects whose metrics differ. Existing HCPDP methods handle only one source project, or multiple source projects that share the same metrics, which limits the range of usable source projects. In this paper, we propose Heterogeneous Defect Prediction with Multiple source projects (HDPM), which can use multiple heterogeneous source projects for defect prediction. HDPM is based on transfer learning, which learns knowledge from one domain and uses it to help in another domain. HDPM constructs a projective matrix between the heterogeneous source and target projects to make their distributions similar. We conduct experiments on 14 projects from four public datasets; the results show that HDPM achieves better performance than existing CPDP methods and outperforms or is comparable to within-project defect prediction. Using multiple heterogeneous source projects can effectively extend the data acquisition range of defect prediction and make software defect prediction more applicable to software engineering. |
first_indexed | 2024-12-16T15:02:23Z |
id | doaj.art-89a025c999264a2fa3c15a826e0a6162 |
institution | Directory of Open Access Journals |
issn | 1551-0018 |
last_indexed | 2024-12-16T15:02:23Z |
spelling | Xinglong Yin, Lei Liu, Huaxiao Liu, Qi Wu (all authors: Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China). Mathematical Biosciences and Engineering (AIMS Press), ISSN 1551-0018, vol. 17, no. 2, pp. 1020-1040, 2020-01-01. DOI: 10.3934/mbe.2020054. Keywords: defect prediction; heterogeneous metrics; multiple heterogeneous source projects; transfer learning. |
topic | defect prediction heterogeneous metrics multiple heterogeneous source projects transfer learning |
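
The description in this record says that HDPM builds a projective matrix between heterogeneous source and target projects so that their distributions become similar. The record does not give the paper's actual objective or algorithm, so the following is only a minimal sketch of the general idea under stated assumptions: it uses a simple stand-in (matching mean and covariance through PCA bases), not HDPM itself. The helper names `pca_basis`, `fit_projection`, and `project`, and the synthetic data, are hypothetical.

```python
import numpy as np

def pca_basis(X, k):
    """Mean, top-k principal directions (d x k) and their standard deviations."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T, s[:k] / np.sqrt(X.shape[0] - 1)

def fit_projection(Xs, Xt, k):
    """Hypothetical helper (not from the paper): build a (d_s x d_t) matrix that
    maps source metrics into the target metric space by matching PCA variances."""
    mean_s, Vs, std_s = pca_basis(Xs, k)
    mean_t, Vt, std_t = pca_basis(Xt, k)
    P = Vs @ np.diag(std_t / std_s) @ Vt.T
    return P, mean_s, mean_t

def project(Xs, P, mean_s, mean_t):
    """Center the source data, map it with P, and shift it to the target mean."""
    return (Xs - mean_s) @ P + mean_t

# Synthetic stand-ins: one target project with 4 metrics and two source
# projects with 6 and 5 metrics (heterogeneous metric sets).
rng = np.random.default_rng(0)
target = rng.normal(loc=2.0, size=(150, 4)) @ rng.normal(size=(4, 4))
sources = [
    rng.normal(size=(200, 6)) * rng.uniform(0.5, 3.0, size=6),
    rng.normal(size=(120, 5)) * rng.uniform(0.5, 3.0, size=5),
]

# Project every source into the target's metric space, then pool the rows.
projected = []
for Xs in sources:
    P, mean_s, mean_t = fit_projection(Xs, target, k=target.shape[1])
    projected.append(project(Xs, P, mean_s, mean_t))

pooled = np.vstack(projected)
```

After projection, each source project has the same number of columns as the target and the same mean and covariance, so the pooled rows could in principle be passed to any ordinary classifier trained for the target. This stand-in ignores defect labels and any correspondence between individual metrics, which a transfer-learning method such as the one described would presumably need to address.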