A rule-based stemmer for Arabic Gulf dialect
Arabic dialects have been widely used for many years in place of Modern Standard Arabic in many fields. The presence of dialects in any language is a big challenge. Dialects add a new set of variational dimensions in fields such as natural language processing, information retrieval and even...
Main Authors: | Belal Abuata, Asma Al-Omari |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2015-04-01 |
Series: | Journal of King Saud University: Computer and Information Sciences |
Subjects: | Arabic dialect stemmer, Gulf dialect, Rule base stemming, Arabic NLP |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1319157815000191 |
_version_ | 1818909214825775104 |
---|---|
author | Belal Abuata Asma Al-Omari |
author_facet | Belal Abuata Asma Al-Omari |
author_sort | Belal Abuata |
collection | DOAJ |
description | Arabic dialects have been widely used for many years in place of Modern Standard Arabic in many fields. The presence of dialects in any language is a big challenge. Dialects add a new set of variational dimensions in fields such as natural language processing, information retrieval and even Arabic chatting between different Arab nationals. Unlike Modern Standard Arabic, spoken dialects have no standard morphology, phonology or lexicon. Hence, the objective of this paper is to describe a procedure or algorithm by which a stem for the Arabian Gulf dialect can be defined. The algorithm is rule based. Special rules are created to remove the suffixes and prefixes of the dialect words. The algorithm also applies rules related to the word size and the relation between adjacent letters. The algorithm was tested on a number of words and gave a good ratio of correct stems. The algorithm was also compared with two Modern Standard Arabic algorithms. The results showed that Modern Standard Arabic stemmers performed poorly on the Arabic Gulf dialect, and that our algorithm performed poorly when applied to Modern Standard Arabic words. |
first_indexed | 2024-12-19T22:23:22Z |
format | Article |
id | doaj.art-011217dcd71043599e8715cb36eed1ab |
institution | Directory Open Access Journal |
issn | 1319-1578 |
language | English |
last_indexed | 2024-12-19T22:23:22Z |
publishDate | 2015-04-01 |
publisher | Elsevier |
record_format | Article |
series | Journal of King Saud University: Computer and Information Sciences |
spelling | doaj.art-011217dcd71043599e8715cb36eed1ab; 2022-12-21T20:03:34Z; eng; Elsevier; Journal of King Saud University: Computer and Information Sciences; 1319-1578; 2015-04-01; 27; 2; 104; 112; 10.1016/j.jksuci.2014.04.003; A rule-based stemmer for Arabic Gulf dialect; Belal Abuata; Asma Al-Omari; http://www.sciencedirect.com/science/article/pii/S1319157815000191; Arabic dialect stemmer; Gulf dialect; Rule base stemming; Arabic NLP |
spellingShingle | Belal Abuata Asma Al-Omari A rule-based stemmer for Arabic Gulf dialect Journal of King Saud University: Computer and Information Sciences Arabic dialect stemmer Gulf dialect Rule base stemming Arabic NLP |
title | A rule-based stemmer for Arabic Gulf dialect |
title_full | A rule-based stemmer for Arabic Gulf dialect |
title_fullStr | A rule-based stemmer for Arabic Gulf dialect |
title_full_unstemmed | A rule-based stemmer for Arabic Gulf dialect |
title_short | A rule-based stemmer for Arabic Gulf dialect |
title_sort | rule based stemmer for arabic gulf dialect |
topic | Arabic dialect stemmer Gulf dialect Rule base stemming Arabic NLP |
url | http://www.sciencedirect.com/science/article/pii/S1319157815000191 |
work_keys_str_mv | AT belalabuata arulebasedstemmerforarabicgulfdialect AT asmaalomari arulebasedstemmerforarabicgulfdialect AT belalabuata rulebasedstemmerforarabicgulfdialect AT asmaalomari rulebasedstemmerforarabicgulfdialect |
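
The description above mentions rules that remove prefixes and suffixes, together with rules tied to word size. The record does not list the paper's actual rules, so the following is only a minimal sketch of that general idea. The affix lists, the minimum stem length, and the function name `strip_affixes` are all illustrative assumptions and are not taken from the paper; the adjacent-letter rules are not modeled at all.

```python
# Minimal sketch of rule-based affix stripping with a word-size guard.
# ASSUMPTION: the affix lists and MIN_STEM_LEN below are illustrative
# placeholders, not the rule set described in the paper.
PREFIXES = ("وال", "بال", "ال", "و")
SUFFIXES = ("ات", "ين", "ون", "ها")
MIN_STEM_LEN = 3  # hypothetical word-size rule: never cut below 3 letters


def strip_affixes(word: str) -> str:
    """Remove at most one prefix and one suffix, longest match first."""
    stem = word

    # Try longer prefixes first so that a compound prefix wins over "و".
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if stem.startswith(prefix) and len(stem) - len(prefix) >= MIN_STEM_LEN:
            stem = stem[len(prefix):]
            break

    # Same strategy for suffixes, again guarded by the size rule.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM_LEN:
            stem = stem[:-len(suffix)]
            break

    return stem


if __name__ == "__main__":
    for w in ("المعلمين", "ولد"):
        print(w, "->", strip_affixes(w))
```

In this sketch the size guard is what keeps a short word such as "ولد" intact even though it begins with a letter that appears in the prefix list. A real dialect stemmer would need rule lists built for Gulf dialect words, which this record does not provide.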