Informative Language Encoding by Variational Autoencoders Using Transformer
In natural language processing (NLP), the Transformer is widely used and has reached state-of-the-art performance in numerous NLP tasks such as language modeling, summarization, and classification. Moreover, the variational autoencoder (VAE) is an efficient generative model for representation learning, combining deep learning with statistical inference in encoded representations…
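The abstract describes the approach only at a high level: a Transformer encoder coupled with a VAE, trained with KL annealing so that the KL term does not vanish (posterior collapse). The sketch below is a rough, illustrative rendering of what such a KL-annealed VAE objective commonly looks like; it is not taken from the paper, it assumes a PyTorch implementation with a diagonal-Gaussian posterior and a decoder that returns per-token logits, and every name in it (`kl_weight`, `vae_loss`, `warmup_steps`, `pad_id`) is a hypothetical placeholder.

```python
# Minimal sketch of a KL-annealed VAE objective (illustrative only, not the
# authors' implementation). Assumes an encoder that outputs (mu, logvar) for a
# diagonal-Gaussian posterior and a decoder that returns per-token logits.
import torch
import torch.nn.functional as F


def kl_weight(step: int, warmup_steps: int = 10_000) -> float:
    """Linear KL-annealing schedule: the KL weight ramps from 0 to 1."""
    return min(1.0, step / warmup_steps)


def vae_loss(logits, targets, mu, logvar, step, pad_id=0):
    """Token-level reconstruction loss plus an annealed KL term."""
    # Reconstruction: cross-entropy over the vocabulary, ignoring padding tokens.
    recon = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        ignore_index=pad_id,
    )
    # Closed-form KL( q(z|x) || N(0, I) ) for a diagonal Gaussian,
    # averaged over the batch and latent dimensions.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight(step) * kl
```

Keeping the KL weight near zero early in training and ramping it up gradually is the usual way posterior collapse is counteracted in text VAEs; the abstract reports that KL annealing mitigated posterior collapse in the authors' experiments.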
Main Authors: | Changwon Ok, Geonseok Lee, Kichun Lee |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-08-01 |
Series: | Applied Sciences |
Subjects: | natural language processing; transformer; variational autoencoder; text mining |
Online Access: | https://www.mdpi.com/2076-3417/12/16/7968 |
_version_ | 1827623958407544832 |
---|---|
author | Changwon Ok Geonseok Lee Kichun Lee |
author_facet | Changwon Ok Geonseok Lee Kichun Lee |
author_sort | Changwon Ok |
collection | DOAJ |
description | In natural language processing (NLP), the Transformer is widely used and has reached state-of-the-art performance in numerous NLP tasks such as language modeling, summarization, and classification. Moreover, the variational autoencoder (VAE) is an efficient generative model for representation learning, combining deep learning with statistical inference in encoded representations. However, the use of VAEs in natural language processing often raises practical difficulties such as posterior collapse, also known as Kullback–Leibler (KL) vanishing. To mitigate this problem, while taking advantage of the parallelization of language data processing, we propose a new language representation model that integrates two seemingly different deep learning models: a Transformer coupled with a variational autoencoder. We compare the proposed model with previous approaches, such as a VAE connected to a recurrent neural network (RNN). Our experiments with four real-life datasets show that training with KL annealing mitigates posterior collapse. The results also show that the proposed Transformer model outperforms RNN-based models in reconstruction and representation learning, and that the encoded representations of the proposed model are more informative than those of the other tested models. |
first_indexed | 2024-03-09T11:57:58Z |
format | Article |
id | doaj.art-d28060e207ec43dd9bab71b3839987ad |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T11:57:58Z |
publishDate | 2022-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-d28060e207ec43dd9bab71b3839987ad; 2023-11-30T23:07:00Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2022-08-01; vol. 12, iss. 16, art. 7968; doi:10.3390/app12167968; Informative Language Encoding by Variational Autoencoders Using Transformer; Changwon Ok (KT Corporation, Seongnam 13606, Korea); Geonseok Lee (Department of Industrial Engineering, Hanyang University, Seoul 04763, Korea); Kichun Lee (Department of Industrial Engineering, Hanyang University, Seoul 04763, Korea); https://www.mdpi.com/2076-3417/12/16/7968; natural language processing; transformer; variational autoencoder; text mining |
spellingShingle | Changwon Ok Geonseok Lee Kichun Lee Informative Language Encoding by Variational Autoencoders Using Transformer Applied Sciences natural language processing transformer variational autoencoder text mining |
title | Informative Language Encoding by Variational Autoencoders Using Transformer |
title_full | Informative Language Encoding by Variational Autoencoders Using Transformer |
title_fullStr | Informative Language Encoding by Variational Autoencoders Using Transformer |
title_full_unstemmed | Informative Language Encoding by Variational Autoencoders Using Transformer |
title_short | Informative Language Encoding by Variational Autoencoders Using Transformer |
title_sort | informative language encoding by variational autoencoders using transformer |
topic | natural language processing; transformer; variational autoencoder; text mining |
url | https://www.mdpi.com/2076-3417/12/16/7968 |
work_keys_str_mv | AT changwonok informativelanguageencodingbyvariationalautoencodersusingtransformer AT geonseoklee informativelanguageencodingbyvariationalautoencodersusingtransformer AT kichunlee informativelanguageencodingbyvariationalautoencodersusingtransformer |