A SimCSE-based model for sentiment analysis in Chinese text messages

Bibliographic Details
Main Author: Song, Haiyang
Other Authors: -
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/177139
Description
Summary: Existing sentiment analysis algorithms mainly focus on vectorized representations of textual data and on constructing high-quality deep learning classifiers. However, improving sentence embedding methods could also enhance textual sentiment classification models. This project introduces a model for text-level sentiment classification that uses contrastive learning and BERT pre-trained language models. The model combines SimCSE with self-supervised BERT training using contrastive learning. It converts a simple text-level sentiment analysis dataset into sentence pairs through back translation and constructs a siamese network of two BERT encoders that share the same structure and parameters. Feeding the back-translated sentiment analysis text pairs into the BERT encoders yields sentence representation vectors. The model is optimized by summing the loss functions and back-propagating the result. Finally, one side of the trained siamese BERT network is applied to a supervised classification module for Chinese text sentiment classification. Experimental validation on three Chinese datasets (waimai_10k, ChnSentiCorp_htl_all, and online_shopping_10_cats) demonstrates that the model is effective and outperforms several cutting-edge text-level sentiment classification models. Keywords: Natural Language Processing; Sentiment Analysis; Contrastive Learning; Siamese Network.
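
Illustrative note: the summary does not state the exact loss used in the thesis. As a minimal sketch, assuming a SimCSE-style InfoNCE objective with in-batch negatives, the contrastive step over original and back-translated sentence pairs might look like the following. The function names, the toy two-dimensional vectors, and the temperature of 0.05 are assumptions made for illustration; in practice the vectors would come from the shared-weight BERT encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(originals, back_translated, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives (hypothetical helper).

    originals[i] and back_translated[i] are embeddings of a sentence and
    its back-translated paraphrase (the positive pair). Every other
    back-translated sentence in the batch serves as a negative.
    """
    losses = []
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, other) / temperature
                  for other in back_translated]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(
            sum(math.exp(x - top) for x in logits))
        # Negative log-probability of picking the true paraphrase.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings standing in for encoder outputs (assumed values).
originals = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # paraphrases close to their sources
swapped = [aligned[1], aligned[0]]   # same vectors, mismatched pairing

loss_aligned = contrastive_loss(originals, aligned)
loss_swapped = contrastive_loss(originals, swapped)
```

With these toy vectors the correctly paired batch gives a much lower loss than the mismatched one, which is the signal that pulls a sentence and its back translation together during training.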