Regularization, robustness and sparsity of probabilistic topic models
We propose a generalized probabilistic topic model of text corpora that can incorporate the heuristics of Bayesian regularization, sampling, frequent parameter updates, and robustness in any combination. Well-known models such as PLSA, LDA, CVB0, SWB, and many others can be considered special cases of the p...
Main Authors: | Konstantin Vyacheslavovich Vorontsov, Anna Alexandrovna Potapenko |
---|---|
Format: | Article |
Language: | Russian |
Published: | Institute of Computer Science, 2012-12-01 |
Series: | Компьютерные исследования и моделирование |
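Subjects: | text analysis; topic modeling; probabilistic latent semantic analysis; EM-algorithm; latent Dirichlet allocation; Gibbs sampling; Bayesian regularization; perplexity; robustness |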
Online Access: | http://crm.ics.org.ru/uploads/crmissues/crm_2012_4/12403.pdf |
_version_ | 1818521239798415360 |
---|---|
author | Konstantin Vyacheslavovich Vorontsov; Anna Alexandrovna Potapenko |
author_facet | Konstantin Vyacheslavovich Vorontsov; Anna Alexandrovna Potapenko |
author_sort | Konstantin Vyacheslavovich Vorontsov |
collection | DOAJ |
description | We propose a generalized probabilistic topic model of text corpora that can incorporate the heuristics of Bayesian regularization, sampling, frequent parameter updates, and robustness in any combination. Well-known models such as PLSA, LDA, CVB0, SWB, and many others can be considered special cases of the proposed broad family of models. We propose the robust PLSA model and show that it is sparser and performs better than regularized models such as LDA. |
first_indexed | 2024-12-11T01:48:24Z |
format | Article |
id | doaj.art-e08249df265942c8a5575feb34f6b1f5 |
institution | Directory of Open Access Journals |
issn | 2076-7633 2077-6853 |
language | Russian |
last_indexed | 2024-12-11T01:48:24Z |
publishDate | 2012-12-01 |
publisher | Institute of Computer Science |
record_format | Article |
series | Компьютерные исследования и моделирование |
spelling | doaj.art-e08249df265942c8a5575feb34f6b1f5; 2022-12-22T01:24:50Z; rus; Institute of Computer Science; Компьютерные исследования и моделирование; 2076-7633; 2077-6853; 2012-12-01; 4; 4; 693; 706; 10.20537/2076-7633-2012-4-4-693-706; 1950; Regularization, robustness and sparsity of probabilistic topic models; Konstantin Vyacheslavovich Vorontsov; Anna Alexandrovna Potapenko; We propose a generalized probabilistic topic model of text corpora that can incorporate the heuristics of Bayesian regularization, sampling, frequent parameter updates, and robustness in any combination. Well-known models such as PLSA, LDA, CVB0, SWB, and many others can be considered special cases of the proposed broad family of models. We propose the robust PLSA model and show that it is sparser and performs better than regularized models such as LDA.; http://crm.ics.org.ru/uploads/crmissues/crm_2012_4/12403.pdf; text analysis; topic modeling; probabilistic latent semantic analysis; EM-algorithm; latent Dirichlet allocation; Gibbs sampling; Bayesian regularization; perplexity; robustness |
title | Regularization, robustness and sparsity of probabilistic topic models |
topic | text analysis; topic modeling; probabilistic latent semantic analysis; EM-algorithm; latent Dirichlet allocation; Gibbs sampling; Bayesian regularization; perplexity; robustness |
url | http://crm.ics.org.ru/uploads/crmissues/crm_2012_4/12403.pdf |