Enhancing the Generalization for Text Classification through Fusion of Backward Features

Bibliographic Details
Main Authors: Dewen Seng, Xin Wu
Format: Article
Language: English
Published: MDPI AG, 2023-01-01
Series: Sensors
Subjects: deep learning; text classification; two-stream networks; feature fusion; sentiment classification; sarcasm detection
Online Access: https://www.mdpi.com/1424-8220/23/3/1287
collection DOAJ
description Generalization has always been a central concern in deep learning. Pretrained models and domain adaptation techniques have received widespread attention as ways to improve generalization; both focus on finding features in the data that improve generalization and prevent overfitting. Although they achieve good results on various tasks, these models are unstable when classifying a sentence whose label is positive but that still contains negative phrases. In this article, we analyze the attention heat maps of the benchmarks and find that previous models attend more to individual phrases than to the semantic information of the whole sentence. We therefore propose a method that scatters attention away from opposite-sentiment words to avoid a one-sided judgment. We design a two-stream network and stack a gradient reversal layer and a feature projection layer within the auxiliary network. The gradient reversal layer reverses the gradient of the features during training, so the parameters are optimized along the reversed gradient in the backpropagation stage. The auxiliary network extracts these backward features, which are then fed into the main network and merged with the normal features extracted by the main network. We apply this method to three baselines, TextCNN, BERT, and RoBERTa, on sentiment analysis and sarcasm detection datasets. The results show that our method improves accuracy on the sentiment analysis datasets by 0.5% and on the sarcasm detection datasets by 2.1%.
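The gradient reversal trick described in the abstract (identity in the forward pass, negated gradient in the backward pass) can be sketched with a hand-computed backward pass on a toy scalar model. This is an illustrative sketch only, not the authors' implementation; the function names (`grl_forward`, `grl_backward`) and the toy loss are our own assumptions.

```python
def grl_forward(x):
    # Gradient reversal layer acts as the identity in the forward pass.
    return x

def grl_backward(grad_output, lambda_=1.0):
    # In the backward pass the gradient is negated (and optionally scaled),
    # so parameters behind the layer are pushed in the opposite direction.
    return -lambda_ * grad_output

# Toy example: shared feature f = w * x, auxiliary loss L = 0.5 * (grl(f) - y)^2
w, x, y = 2.0, 3.0, 1.0
f = w * x               # shared feature: 6.0
g = grl_forward(f)      # identity forward: 6.0
loss = 0.5 * (g - y) ** 2

# Backward pass by hand (chain rule):
dL_dg = g - y                 # dL/dg = 5.0
dL_df = grl_backward(dL_dg)   # gradient reversed: -5.0
dL_dw = dL_df * x             # dL/dw = -15.0 (instead of +15.0 without the GRL)
print(dL_dw)  # -15.0
```

In the paper's two-stream setup this reversed gradient trains the auxiliary branch, whose "backward features" are then fused with the main network's features; in an autograd framework such as PyTorch the same effect is typically obtained with a custom autograd function.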
first_indexed 2024-03-11T09:26:22Z
id doaj.art-aa118c2a59f344d1a14a3a96b74314dd
institution Directory Open Access Journal
issn 1424-8220
last_indexed 2024-03-11T09:26:22Z
doi 10.3390/s23031287
citation Sensors, vol. 23, no. 3, art. 1287, 2023-01-01
affiliation Dewen Seng: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310005, China
affiliation Xin Wu: School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310005, China
topic deep learning
text classification
two-stream networks
feature fusion
sentiment classification
sarcasm detection