An innovative automatic indexing method for Arabic text
Automatic indexing and text retrieval methods for languages have been studied for a long time. Automatic indexing is a process of extracting words from a document to classify the documents per subject and to enhance the information retrieval process. Compared to other languages, there is st...
Main Authors: | Ramzi A. Haraty, Sanaa Kaddoura, Sultan Al Jahdali, Nour K. Masri |
---|---|
Format: | Article |
Language: | English |
Published: | Academy Publishing Center, 2023-03-01 |
Series: | Advances in Computing and Engineering |
Subjects: | arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms |
Online Access: | http://apc.aast.edu/ojs/index.php/ACE/article/view/557 |
_version_ | 1797258896264396800 |
---|---|
author | Ramzi A. Haraty, Sanaa Kaddoura, Sultan Al Jahdali, Nour K. Masri |
author_facet | Ramzi A. Haraty, Sanaa Kaddoura, Sultan Al Jahdali, Nour K. Masri |
author_sort | Ramzi A. Haraty |
collection | DOAJ |
description | Automatic indexing and text retrieval methods for languages have been studied for a long time. Automatic indexing is the process of extracting words from a document in order to classify documents by subject and to improve information retrieval. Compared to other languages, research on automated Arabic text categorization is still limited. In this work, we present an innovative method that improves the accuracy of automatic indexing of Arabic texts by introducing and integrating a thesaurus. Our model extracts new relevant words by referring to the created thesaurus, which contains and identifies words, synonyms, and correlations. The thesaurus is built using a natural language toolkit, which contains a library that lists the synonyms of a particular word available in the WordNet library. Words that have the same meaning and frequently appear together are grouped under one umbrella using a JavaScript Object Notation (JSON) dictionary, making it easier to identify the topic of the text. Our results exhibit notable improvement in accuracy and efficiency compared to previous works. |
first_indexed | 2024-03-12T01:40:34Z |
format | Article |
id | doaj.art-ba8e1ad20c5e48d986b34d251ab2aa02 |
institution | Directory Open Access Journal |
issn | 2735-5977 2735-5985 |
language | English |
last_indexed | 2024-04-24T23:00:49Z |
publishDate | 2023-03-01 |
publisher | Academy Publishing Center |
record_format | Article |
series | Advances in Computing and Engineering |
spelling | doaj.art-ba8e1ad20c5e48d986b34d251ab2aa022024-03-17T15:34:15ZengAcademy Publishing CenterAdvances in Computing and Engineering2735-59772735-59852023-03-0131012310.21622/ACE.2023.03.1.001269An innovative automatic indexing method for Arabic textRamzi A. Haraty0Sanaa KaddouraSultan Al JahdaliNour K. MasriLebanese American University<p>Automatic indexing and text retrieval methods for languages have been studied for a long time. Automatic indexing is a process of extracting words from a document to classify the documents per subject and to enhance the information retrieval process. Compared to other languages, there is still limited research conducted for automated Arabic text categorization. In this work, we present an innovative method to reinforce the accuracy of automatic indexing of Arabic texts by introducing and integrating a thesaurus. Our model extracts new relevant words by referring to the created thesaurus, which contains and identifies words, synonyms, and correlations. This thesaurus is built using a natural language toolkit, which contains a library that lists the synonyms of a particular word available in the WordNet library. The words that have the same meaning and frequently appear together are grouped under one umbrella using a JavaScript Object Notation dictionary, making it leisurely to identify the topic of the text. Our results exhibit notable improvement in accuracy and efficiency compared to previous works.</p>http://apc.aast.edu/ojs/index.php/ACE/article/view/557arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms |
spellingShingle | Ramzi A. Haraty Sanaa Kaddoura Sultan Al Jahdali Nour K. Masri An innovative automatic indexing method for Arabic text Advances in Computing and Engineering arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms |
title | An innovative automatic indexing method for Arabic text |
title_full | An innovative automatic indexing method for Arabic text |
title_fullStr | An innovative automatic indexing method for Arabic text |
title_full_unstemmed | An innovative automatic indexing method for Arabic text |
title_short | An innovative automatic indexing method for Arabic text |
title_sort | innovative automatic indexing method for arabic text |
topic | arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms |
url | http://apc.aast.edu/ojs/index.php/ACE/article/view/557 |
work_keys_str_mv | AT ramziaharaty aninnovativeautomaticindexingmethodforarabictext AT sanaakaddoura aninnovativeautomaticindexingmethodforarabictext AT sultanaljahdali aninnovativeautomaticindexingmethodforarabictext AT nourkmasri aninnovativeautomaticindexingmethodforarabictext AT ramziaharaty innovativeautomaticindexingmethodforarabictext AT sanaakaddoura innovativeautomaticindexingmethodforarabictext AT sultanaljahdali innovativeautomaticindexingmethodforarabictext AT nourkmasri innovativeautomaticindexingmethodforarabictext |
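
The description field above says that synonyms and co-occurring words are grouped under one umbrella term in a JSON dictionary, which is then used to identify the topic of a text. The following is a minimal sketch of that idea only, not the authors' implementation. The thesaurus entries, the function names `build_lookup` and `index_text`, and the sample sentence are all hypothetical; the record does not give the actual thesaurus, which the description says is derived from WordNet synonyms. Tokenization here is a plain whitespace split, so it assumes words appear in exactly the form listed in the thesaurus.

```python
import json
from collections import Counter

# Hypothetical thesaurus: umbrella topic -> synonyms and related words.
# The real thesaurus described in the record is built from WordNet synonyms;
# these hand-written entries are for illustration only.
THESAURUS_JSON = """
{
  "تعليم": ["مدرسة", "طالب", "معلم", "كتاب"],
  "رياضة": ["كرة", "لاعب", "مباراة", "فريق"]
}
"""


def build_lookup(thesaurus):
    """Invert the thesaurus so every word points at its umbrella topic."""
    lookup = {}
    for topic, words in thesaurus.items():
        lookup[topic] = topic  # the umbrella term also counts as a hit
        for word in words:
            lookup[word] = topic
    return lookup


def index_text(text, lookup):
    """Count how many tokens of the text fall under each umbrella topic."""
    counts = Counter()
    for token in text.split():  # naive whitespace tokenization (assumption)
        topic = lookup.get(token)
        if topic is not None:
            counts[topic] += 1
    return counts


thesaurus = json.loads(THESAURUS_JSON)
lookup = build_lookup(thesaurus)

# Hypothetical sample: "a student reads a book in a school with a teacher"
sample = "طالب يقرأ كتاب في مدرسة مع معلم"
topic_counts = index_text(sample, lookup)
best_topic = topic_counts.most_common(1)[0][0]
print(best_topic, dict(topic_counts))
```

Inverting the dictionary once keeps each token lookup constant-time. A practical Arabic indexer would also need normalization and stemming (for example, handling the definite article) before the lookup, which this sketch leaves out.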