Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling
Topic models are widely explored for summarizing a corpus of documents. Recent advances in Variational AutoEncoders (VAEs) have enabled black-box inference methods for topic modeling that alleviate the drawbacks of classical statistical inference. Most existing VAE-based approaches assume a unimodal Gaussian distribution for the approximate posterior of the latent variables, which limits the flexibility of the latent-space encoding. In addition, the unsupervised architecture hinders the incorporation of extra label information, which is ubiquitous in many applications. In this paper, we propose a semi-supervised topic model under the VAE framework. We assume that a document is modeled as a mixture of classes, and a class is modeled as a mixture of latent topics. A multimodal Gaussian mixture model is adopted for the latent space. The parameters of the mixture components and the mixing weights are encoded separately. These weights, together with the partially labeled data, also contribute to the training of a classifier. The objective is derived under the Gaussian mixture assumption and the semi-supervised VAE framework. The modules of the proposed framework are designed accordingly. Experiments on three benchmark datasets demonstrate the effectiveness of our method compared to several competitive baselines.
Main Authors: | Cangqi Zhou, Hao Ban, Jing Zhang, Qianmu Li, Yinghua Zhang |
Format: | Article |
Language: | English |
Published: | IEEE, 2020-01-01 |
Series: | IEEE Access |
Subjects: | Topic model; variational autoencoder; semi-supervised learning; Gaussian mixture model; deep generative learning |
Online Access: | https://ieeexplore.ieee.org/document/9112154/ |
_version_ | 1818616951384047616 |
author | Cangqi Zhou Hao Ban Jing Zhang Qianmu Li Yinghua Zhang |
author_facet | Cangqi Zhou Hao Ban Jing Zhang Qianmu Li Yinghua Zhang |
author_sort | Cangqi Zhou |
collection | DOAJ |
description | Topic models are widely explored for summarizing a corpus of documents. Recent advances in Variational AutoEncoders (VAEs) have enabled black-box inference methods for topic modeling that alleviate the drawbacks of classical statistical inference. Most existing VAE-based approaches assume a unimodal Gaussian distribution for the approximate posterior of the latent variables, which limits the flexibility of the latent-space encoding. In addition, the unsupervised architecture hinders the incorporation of extra label information, which is ubiquitous in many applications. In this paper, we propose a semi-supervised topic model under the VAE framework. We assume that a document is modeled as a mixture of classes, and a class is modeled as a mixture of latent topics. A multimodal Gaussian mixture model is adopted for the latent space. The parameters of the mixture components and the mixing weights are encoded separately. These weights, together with the partially labeled data, also contribute to the training of a classifier. The objective is derived under the Gaussian mixture assumption and the semi-supervised VAE framework. The modules of the proposed framework are designed accordingly. Experiments on three benchmark datasets demonstrate the effectiveness of our method compared to several competitive baselines. |
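The encoder structure the abstract describes (mixing weights over classes, per-component Gaussian parameters, and weights reused as classifier outputs) can be sketched as follows. This is a minimal NumPy illustration of the idea, not the authors' implementation: all weight matrices, dimensions, and the bag-of-words input are hypothetical, and the reparameterized sampling step is shown for a single hard component draw.

```python
import numpy as np

rng = np.random.default_rng(0)

V, K, D = 50, 3, 8  # vocab size, number of classes/components, latent dim

# Hypothetical encoder weights (random, for illustration only).
W_pi = rng.normal(scale=0.1, size=(V, K))        # mixing-weight head
W_mu = rng.normal(scale=0.1, size=(K, V, D))     # per-component mean heads
W_lv = rng.normal(scale=0.1, size=(K, V, D))     # per-component log-variance heads

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def encode(x):
    """Encode a bag-of-words vector into mixing weights and per-component
    Gaussian parameters, which are produced by separate heads."""
    pi = softmax(x @ W_pi)                     # document-level class mixture
    mu = np.einsum('v,kvd->kd', x, W_mu)       # K component means
    logvar = np.einsum('v,kvd->kd', x, W_lv)   # K component log-variances
    return pi, mu, logvar

def sample_latent(pi, mu, logvar):
    """Reparameterized draw from the Gaussian mixture posterior:
    pick a component by its mixing weight, then sample its Gaussian."""
    k = rng.choice(len(pi), p=pi)
    eps = rng.standard_normal(mu.shape[1])
    return mu[k] + np.exp(0.5 * logvar[k]) * eps

x = rng.poisson(1.0, size=V).astype(float)  # toy bag-of-words document
pi, mu, logvar = encode(x)
z = sample_latent(pi, mu, logvar)

# The mixing weights double as class predictions for (partially) labeled
# documents, which is how the semi-supervised classifier is trained.
predicted_class = int(np.argmax(pi))
```

In the paper's setting, `z` would be decoded back into a topic-word reconstruction and the mixing weights `pi` would receive a supervised loss on the labeled subset; both pieces are omitted here.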
first_indexed | 2024-12-16T16:57:57Z |
format | Article |
id | doaj.art-7e478776d5ef40bf95e175c244396031 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-16T16:57:57Z |
publishDate | 2020-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-7e478776d5ef40bf95e175c244396031 (updated 2022-12-21T22:23:49Z); eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 106843–106854; DOI 10.1109/ACCESS.2020.3001189; article no. 9112154. "Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling" by Cangqi Zhou (https://orcid.org/0000-0003-0528-8202; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China), Hao Ban (https://orcid.org/0000-0003-0724-2857; School of Information Science and Engineering, Southeast University, Nanjing, China), Jing Zhang (https://orcid.org/0000-0003-2541-4923; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China), Qianmu Li (https://orcid.org/0000-0002-0998-1517; School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing, China), Yinghua Zhang (https://orcid.org/0000-0003-0324-4812; SenseDeal Intelligent Technology Company Ltd., Beijing, China). Online: https://ieeexplore.ieee.org/document/9112154/. Subjects: Topic model; variational autoencoder; semi-supervised learning; Gaussian mixture model; deep generative learning. |
spellingShingle | Cangqi Zhou Hao Ban Jing Zhang Qianmu Li Yinghua Zhang Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling IEEE Access Topic model variational autoencoder semi-supervised learning Gaussian mixture model deep generative learning |
title | Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling |
title_full | Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling |
title_fullStr | Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling |
title_full_unstemmed | Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling |
title_short | Gaussian Mixture Variational Autoencoder for Semi-Supervised Topic Modeling |
title_sort | gaussian mixture variational autoencoder for semi supervised topic modeling |
topic | Topic model variational autoencoder semi-supervised learning Gaussian mixture model deep generative learning |
url | https://ieeexplore.ieee.org/document/9112154/ |
work_keys_str_mv | AT cangqizhou gaussianmixturevariationalautoencoderforsemisupervisedtopicmodeling AT haoban gaussianmixturevariationalautoencoderforsemisupervisedtopicmodeling AT jingzhang gaussianmixturevariationalautoencoderforsemisupervisedtopicmodeling AT qianmuli gaussianmixturevariationalautoencoderforsemisupervisedtopicmodeling AT yinghuazhang gaussianmixturevariationalautoencoderforsemisupervisedtopicmodeling |