Self-Supervised and Few-Shot Contrastive Learning Frameworks for Text Clustering

Bibliographic Details
Main Authors: Haoxiang Shi, Tetsuya Sakai
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10210342/
Description
Summary: Contrastive learning is a promising approach to unsupervised learning, as it inherits the advantages of well-studied deep models without a dedicated and complex model design. In this paper, based on bidirectional encoder representations from transformers (BERT) and long short-term memory (LSTM) neural networks, we propose self-supervised contrastive learning (SCL) as well as few-shot contrastive learning (FCL) with unsupervised data augmentation (UDA) for text clustering. BERT-SCL outperforms state-of-the-art unsupervised clustering approaches for both short and long texts in terms of several clustering evaluation measures. LSTM-SCL also performs well for short-text clustering. BERT-FCL achieves performance close to that of supervised learning, and BERT-FCL with UDA further improves performance for short texts. LSTM-FCL outperforms the supervised model in terms of several clustering evaluation measures. Our experimental results suggest that both SCL and FCL are effective for text clustering.
ISSN:2169-3536
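
Illustrative sketch: the SCL setup summarized above pairs a BERT (or LSTM) encoder with a contrastive objective and then clusters the learned embeddings. The Python sketch below shows that general idea under stated assumptions, using mean-pooled BERT embeddings and an InfoNCE-style loss over two augmented views of the same sentences; the model name, pooling, temperature, and augmentation choices are assumptions for illustration, not the authors' implementation.

# Minimal sketch of contrastive learning over BERT sentence embeddings.
# Assumptions: "bert-base-uncased", mean pooling, temperature 0.1, and
# paired "views" produced by any text augmentation (e.g. paraphrasing).
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    """Mean-pool BERT token states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)           # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)            # (B, H)

def contrastive_loss(z1, z2, temperature=0.1):
    """InfoNCE-style loss: each sentence's positive is its paired view,
    and the other sentences in the batch act as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                     # (B, B) cosine similarities
    labels = torch.arange(z1.size(0))                      # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Usage: two augmented views of the same sentences form positive pairs;
# the trained embeddings would then be clustered (e.g. with k-means).
view_a = ["contrastive learning for text clustering", "bert encodes sentences"]
view_b = ["text clustering with contrastive learning", "sentences encoded by bert"]
loss = contrastive_loss(embed(view_a), embed(view_b))
loss.backward()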