An innovative automatic indexing method for Arabic text

Bibliographic Details
Main Authors: Ramzi A. Haraty, Sanaa Kaddoura, Sultan Al Jahdali, Nour K. Masri
Format: Article
Language: English
Published: Academy Publishing Center 2023-03-01
Series: Advances in Computing and Engineering
Subjects: arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms
Online Access: http://apc.aast.edu/ojs/index.php/ACE/article/view/557
_version_ 1797258896264396800
author Ramzi A. Haraty
Sanaa Kaddoura
Sultan Al Jahdali
Nour K. Masri
author_facet Ramzi A. Haraty
Sanaa Kaddoura
Sultan Al Jahdali
Nour K. Masri
author_sort Ramzi A. Haraty
collection DOAJ
description Automatic indexing and text retrieval methods for languages have been studied for a long time. Automatic indexing is the process of extracting words from a document in order to classify documents by subject and to improve information retrieval. Compared to other languages, research on automated Arabic text categorization remains limited. In this work, we present an innovative method to improve the accuracy of automatic indexing of Arabic texts by introducing and integrating a thesaurus. Our model extracts new relevant words by referring to the created thesaurus, which contains and identifies words, synonyms, and correlations. This thesaurus is built using a natural language toolkit, which contains a library that lists the synonyms of a particular word available in the WordNet library. Words that have the same meaning and frequently appear together are grouped under one umbrella using a JavaScript Object Notation (JSON) dictionary, making it easy to identify the topic of the text. Our results exhibit notable improvement in accuracy and efficiency compared to previous works.
first_indexed 2024-03-12T01:40:34Z
format Article
id doaj.art-ba8e1ad20c5e48d986b34d251ab2aa02
institution Directory Open Access Journal
issn 2735-5977
2735-5985
language English
last_indexed 2024-04-24T23:00:49Z
publishDate 2023-03-01
publisher Academy Publishing Center
record_format Article
series Advances in Computing and Engineering
spelling doaj.art-ba8e1ad20c5e48d986b34d251ab2aa02; 2024-03-17T15:34:15Z; eng; Academy Publishing Center; Advances in Computing and Engineering; 2735-5977; 2735-5985; 2023-03-01; 31012310.21622/ACE.2023.03.1.001269; An innovative automatic indexing method for Arabic text; Ramzi A. Haraty0; Sanaa Kaddoura; Sultan Al Jahdali; Nour K. Masri; Lebanese American University; Automatic indexing and text retrieval methods for languages have been studied for a long time. Automatic indexing is the process of extracting words from a document in order to classify documents by subject and to improve information retrieval. Compared to other languages, research on automated Arabic text categorization remains limited. In this work, we present an innovative method to improve the accuracy of automatic indexing of Arabic texts by introducing and integrating a thesaurus. Our model extracts new relevant words by referring to the created thesaurus, which contains and identifies words, synonyms, and correlations. This thesaurus is built using a natural language toolkit, which contains a library that lists the synonyms of a particular word available in the WordNet library. Words that have the same meaning and frequently appear together are grouped under one umbrella using a JavaScript Object Notation (JSON) dictionary, making it easy to identify the topic of the text. Our results exhibit notable improvement in accuracy and efficiency compared to previous works.; http://apc.aast.edu/ojs/index.php/ACE/article/view/557; arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms
title An innovative automatic indexing method for Arabic text
topic arabic text, automatic indexing, building thesaurus, frequent sets, json dictionary, synonyms
url http://apc.aast.edu/ojs/index.php/ACE/article/view/557