Informative Language Encoding by Variational Autoencoders Using Transformer

In natural language processing (NLP), the Transformer is widely used and has reached state-of-the-art performance in numerous NLP tasks such as language modeling, summarization, and classification. Moreover, a variational autoencoder (VAE) is an efficient generative model for representation learning, combining deep learning with statistical inference in encoded representations. However, the use of VAEs in natural language processing often brings forth practical difficulties such as posterior collapse, also known as Kullback–Leibler (KL) vanishing. To mitigate this problem, while taking advantage of the parallelization of language data processing, we propose a new language representation model as the integration of two seemingly different deep learning models: a Transformer model solely coupled with a variational autoencoder. We compare the proposed model with previous works, such as a VAE connected with a recurrent neural network (RNN). Our experiments on four real-life datasets show that implementation with KL annealing mitigates posterior collapse. The results also show that the proposed Transformer model outperforms RNN-based models in reconstruction and representation learning, and that the encoded representations of the proposed model are more informative than those of the other tested models.
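
The KL annealing mentioned in the abstract is a standard remedy for posterior collapse: the weight on the KL term is ramped up from zero during training so the decoder does not learn to ignore the latent code. Below is a minimal PyTorch-style sketch of a linearly annealed Gaussian VAE objective; it is an illustrative assumption of how such a schedule is typically wired in, not the authors' implementation, and the function names and warm-up length are hypothetical.

import torch

def kl_divergence(mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions
    # and averaged over the batch.
    return (-0.5 * (1.0 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)).mean()

def kl_weight(step: int, warmup_steps: int = 10000) -> float:
    # Linear annealing: beta grows from 0 to 1 over warmup_steps, so early
    # training is dominated by reconstruction and the KL penalty enters gradually.
    return min(1.0, step / warmup_steps)

def annealed_elbo_loss(recon_loss: torch.Tensor, mu: torch.Tensor,
                       logvar: torch.Tensor, step: int) -> torch.Tensor:
    # Negative ELBO with an annealed KL term: recon + beta(step) * KL.
    return recon_loss + kl_weight(step) * kl_divergence(mu, logvar)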

Bibliographic Details
Main Authors: Changwon Ok (KT Corporation, Seongnam 13606, Korea), Geonseok Lee (Department of Industrial Engineering, Hanyang University, Seoul 04763, Korea), Kichun Lee (Department of Industrial Engineering, Hanyang University, Seoul 04763, Korea)
Format: Article
Language: English
Published: MDPI AG, 2022-08-01
Series: Applied Sciences, Vol. 12, No. 16, Article 7968
ISSN: 2076-3417
DOI: 10.3390/app12167968
Collection: Directory of Open Access Journals (DOAJ)
Subjects: natural language processing; transformer; variational autoencoder; text mining
Online Access: https://www.mdpi.com/2076-3417/12/16/7968