Estimating Number of Topics in Topic Modeling on Persian Research Articles

This article presents a method for estimating the number of topics in Persian research articles, which is one of the main challenges in topic modeling. Topic modeling is the process of automatically recognizing topics in a text with the aim of discovering hidden patterns. This study has estimated the number o...

Bibliographic Details
Main Author: نیلوفر مظفری
Format: Article
Language: fas
Published: Iranian Research Institute for Information and Technology 2023-07-01
Series: Iranian Journal of Information Processing & Management
Subjects: renormalization theory; Rényi entropy; grid search; latent Dirichlet allocation
Online Access: https://jipm.irandoc.ac.ir/article_701394_e29faecb44fc8c928f4560e9909d8164.pdf
author نیلوفر مظفری
collection DOAJ
description This article presents a method for estimating the number of topics in Persian research articles, which is one of the main challenges in topic modeling. Topic modeling is the process of automatically recognizing topics in a text with the aim of discovering hidden patterns. The study estimates the number of topics for Persian research articles using two approaches. The first is based on greedy search; the second uses renormalization theory, a mathematical formalism for constructing a procedure that changes the scale of a system while preserving the system's behavior. The execution times of both algorithms on Persian academic articles are also compared. The findings indicate that the renormalization approach predicts the number of topics in Persian research articles with lower time complexity than the greedy approach. The renormalization-based approach is highly efficient for estimating the number of topics in Persian academic articles.
first_indexed 2024-03-11T12:12:02Z
format Article
id doaj.art-b287d6ae9ddf40f2bc4ff55ae16a472f
institution Directory of Open Access Journals
issn 2251-8223
2251-8231
language fas
last_indexed 2024-03-11T12:12:02Z
publishDate 2023-07-01
publisher Iranian Research Institute for Information and Technology
record_format Article
series Iranian Journal of Information Processing & Management
spelling doaj.art-b287d6ae9ddf40f2bc4ff55ae16a472f; 2023-11-07T10:53:40Z; fas; Iranian Research Institute for Information and Technology; Iranian Journal of Information Processing & Management; 2251-8223; 2251-8231; 2023-07-01; 38; 4; 1345; 1368; 10.22034/jipm.2023.701394; 701394; Estimating Number of Topics in Topic Modeling on Persian Research Articles; نیلوفر مظفری; Regional Information Center for Science and Technology; https://jipm.irandoc.ac.ir/article_701394_e29faecb44fc8c928f4560e9909d8164.pdf; renormalization theory; Rényi entropy; grid search; latent Dirichlet allocation
title Estimating Number of Topics in Topic Modeling on Persian Research Articles
topic renormalization theory
Rényi entropy
grid search
latent Dirichlet allocation
url https://jipm.irandoc.ac.ir/article_701394_e29faecb44fc8c928f4560e9909d8164.pdf