Using Domain Adaptation for Incremental SVM Classification of Drift Data

A common assumption in machine learning is that training data is complete, and the data distribution is fixed. However, in many practical applications, this assumption does not hold. Incremental learning was proposed to compensate for this problem. Common approaches include retraining models and inc...

Bibliographic Details
Main Authors:	Junya Tang, Kuo-Yi Lin, Li Li
Format:	Article
Language:	English
Published:	MDPI AG 2022-09-01
Series:	Mathematics
Subjects:	incremental learning; domain adaptation; SVM classification; ensemble learning
Online Access:	https://www.mdpi.com/2227-7390/10/19/3579

author	Junya Tang, Kuo-Yi Lin, Li Li
collection	DOAJ
description	A common assumption in machine learning is that the training data is complete and the data distribution is fixed. However, in many practical applications, this assumption does not hold. Incremental learning was proposed to address this problem. Common approaches to compensating for a shortage of training data include retraining models and incremental learning. Retraining models is time-consuming and computationally expensive, while incremental learning can save time and computational cost. However, concept drift may affect performance. Two crucial issues should be considered to address concept drift in incremental learning: gaining new knowledge without forgetting previously acquired knowledge, and forgetting obsolete information without corrupting valid information. This paper proposes an incremental support vector machine learning approach with domain adaptation that considers both issues. Firstly, a small amount of new data is used to fine-tune the previous model by transferring parameters, generating a model that is sensitive to the new data but retains information from the previous data. Secondly, an ensemble and model selection mechanism based on Bayesian theory is proposed to keep the valid information. The computational experiments indicate that the performance of the proposed model improved as new data was acquired. In addition, the influence of the degree of data drift on the algorithm is explored. A gain in performance over the support vector machine and incremental support vector machine algorithms has been demonstrated on four out of five industrial datasets and four synthetic datasets.
first_indexed	2024-03-09T21:27:15Z
format	Article
id	doaj.art-210b5f14bd6a436bb9d9e156c0a8bf9c
institution	Directory Open Access Journal
issn	2227-7390
language	English
last_indexed	2024-03-09T21:27:15Z
publishDate	2022-09-01
publisher	MDPI AG
record_format	Article
series	Mathematics
spelling	doaj.art-210b5f14bd6a436bb9d9e156c0a8bf9c | 2023-11-23T21:03:56Z | eng | MDPI AG | Mathematics | 2227-7390 | 2022-09-01 | 10 | 19 | 3579 | 10.3390/math10193579 | Using Domain Adaptation for Incremental SVM Classification of Drift Data | Junya Tang, Kuo-Yi Lin, Li Li (School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China) | https://www.mdpi.com/2227-7390/10/19/3579 | incremental learning; domain adaptation; SVM classification; ensemble learning
title	Using Domain Adaptation for Incremental SVM Classification of Drift Data
topic	incremental learning; domain adaptation; SVM classification; ensemble learning
url	https://www.mdpi.com/2227-7390/10/19/3579
