Estimating Number of Topics in Topic Modeling on Persian Research Articles
This article presents a method for finding the number of topics in Persian research articles, which is one of the main challenges in topic modeling. Topic modeling is the process of automatically recognizing topics in a text with the aim of discovering hidden patterns. This study has estimated the number o...
Main Author: | نیلوفر مظفری |
---|---|
Format: | Article |
Language: | fas |
Published: | Iranian Research Institute for Information and Technology, 2023-07-01 |
Series: | Iranian Journal of Information Processing & Management |
Subjects: | renormalization theory; Rényi entropy; grid search; latent Dirichlet allocation |
Online Access: | https://jipm.irandoc.ac.ir/article_701394_e29faecb44fc8c928f4560e9909d8164.pdf |
_version_ | 1827768429090701312 |
---|---|
author | نیلوفر مظفری |
author_facet | نیلوفر مظفری |
author_sort | نیلوفر مظفری |
collection | DOAJ |
description | This article presents a method for finding the number of topics in Persian research articles, which is one of the main challenges in topic modeling. Topic modeling is the process of automatically recognizing topics in a text with the aim of discovering hidden patterns.
This study estimated the number of topics for Persian research articles using two approaches. The first is based on greedy search, and the second uses renormalization theory, a mathematical formalism for constructing a procedure that changes the scale of a system while preserving the system's behavior. The execution times of the two algorithms on Persian academic articles were also compared.
The findings indicate that the renormalization approach predicts the number of topics in Persian research articles with lower time complexity than the greedy-based approach.
The renormalization-based approach is therefore highly efficient for estimating the number of topics in Persian academic articles. |
first_indexed | 2024-03-11T12:12:02Z |
format | Article |
id | doaj.art-b287d6ae9ddf40f2bc4ff55ae16a472f |
institution | Directory Open Access Journal |
issn | 2251-8223 2251-8231 |
language | fas |
last_indexed | 2024-03-11T12:12:02Z |
publishDate | 2023-07-01 |
publisher | Iranian Research Institute for Information and Technology |
record_format | Article |
series | Iranian Journal of Information Processing & Management |
spelling | doaj.art-b287d6ae9ddf40f2bc4ff55ae16a472f 2023-11-07T10:53:40Z fas Iranian Research Institute for Information and Technology Iranian Journal of Information Processing & Management 2251-8223 2251-8231 2023-07-01 38 4 1345 1368 10.22034/jipm.2023.701394 701394 Estimating Number of Topics in Topic Modeling on Persian Research Articles نیلوفر مظفری 0 مرکز منطقه ای اطلاع رسانی علوم و فناوری This article presents a method for finding the number of topics in Persian research articles, which is one of the main challenges in topic modeling. Topic modeling is the process of automatically recognizing topics in a text with the aim of discovering hidden patterns. This study estimated the number of topics for Persian research articles using two approaches. The first is based on greedy search, and the second uses renormalization theory, a mathematical formalism for constructing a procedure that changes the scale of a system while preserving the system's behavior. The execution times of the two algorithms on Persian academic articles were also compared. The findings indicate that the renormalization approach predicts the number of topics in Persian research articles with lower time complexity than the greedy-based approach. The renormalization-based approach is therefore highly efficient for estimating the number of topics in Persian academic articles. https://jipm.irandoc.ac.ir/article_701394_e29faecb44fc8c928f4560e9909d8164.pdf renormalization theory; Rényi entropy; grid search; latent Dirichlet allocation |
spellingShingle | نیلوفر مظفری; Estimating Number of Topics in Topic Modeling on Persian Research Articles; Iranian Journal of Information Processing & Management; renormalization theory; Rényi entropy; grid search; latent Dirichlet allocation |
title | Estimating Number of Topics in Topic Modeling on Persian Research Articles |
title_full | Estimating Number of Topics in Topic Modeling on Persian Research Articles |
title_fullStr | Estimating Number of Topics in Topic Modeling on Persian Research Articles |
title_full_unstemmed | Estimating Number of Topics in Topic Modeling on Persian Research Articles |
title_short | Estimating Number of Topics in Topic Modeling on Persian Research Articles |
title_sort | estimating number of topics in topic modeling on persian research articles |
topic | renormalization theory; Rényi entropy; grid search; latent Dirichlet allocation |
url | https://jipm.irandoc.ac.ir/article_701394_e29faecb44fc8c928f4560e9909d8164.pdf |
work_keys_str_mv | AT nylwfrmẓfry estimatingnumberoftopicsintopicmodelingonpersianresearcharticles |