Introducing a machine-based approach for Word Sense Disambiguation: using Lesk algorithm and Part Of Speech tagging
The present study introduces a machine-based approach to Word Sense Disambiguation (WSD). Persian is a morphologically complex language in which many homographs arise; one way to perform WSD is to assign the correct Part Of Speech (POS) tags to words beforehand. Since the frequency of noun and adjectiv...
Main Author: | |
---|---|
Format: | Article |
Language: | fas |
Published: | Iranian Research Institute for Information and Technology, 2018-06-01 |
Series: | Iranian Journal of Information Processing & Management |
Subjects: | |
Online Access: | http://jipm.irandoc.ac.ir/browse.php?a_code=A-10-3228-2&slc_lang=en&sid=1 |
_version_ | 1818691893855256576 |
---|---|
author | Elham Alayiaboozar |
collection | DOAJ |
description | The present study introduces a machine-based approach to Word Sense Disambiguation (WSD). Persian is a morphologically complex language in which many homographs arise; one way to perform WSD is to assign the correct Part Of Speech (POS) tags to words beforehand. Because noun and adjective homographs are frequent across Persian text corpora, POS disambiguation of these homographs appears to be a necessary step before WSD. The paper proposes a three-step approach: first, sentences are POS-tagged; next, the tagged sentences pass through POS disambiguation of Persian noun and adjective homographs; finally, the Lesk algorithm (an unsupervised method) is applied to the result for WSD. The proposed approach speeds up WSD by filtering the dictionary glosses down to the relevant ones, and it also increases the accuracy of WSD. |
first_indexed | 2024-12-17T12:49:08Z |
format | Article |
id | doaj.art-7b00f74ad1a742b3be9ce35f5e2eeb0f |
institution | Directory Open Access Journal |
issn | 2251-8223 2251-8231 |
language | fas |
last_indexed | 2024-12-17T12:49:08Z |
publishDate | 2018-06-01 |
publisher | Iranian Research Institute for Information and Technology |
record_format | Article |
series | Iranian Journal of Information Processing & Management |
spelling | doaj.art-7b00f74ad1a742b3be9ce35f5e2eeb0f ; 2022-12-21T21:47:39Z ; fas ; Iranian Research Institute for Information and Technology ; Iranian Journal of Information Processing & Management ; 2251-8223 ; 2251-8231 ; 2018-06-01 ; 33311651182 ; Introducing a machine-based approach for Word Sense Disambiguation: using Lesk algorithm and Part Of Speech tagging ; Elham Alayiaboozar ; Iranian Research Institute for Information Science and Technology (IranDoc) ; The present study introduces a machine-based approach to Word Sense Disambiguation (WSD). Persian is a morphologically complex language in which many homographs arise; one way to perform WSD is to assign the correct Part Of Speech (POS) tags to words beforehand. Because noun and adjective homographs are frequent across Persian text corpora, POS disambiguation of these homographs appears to be a necessary step before WSD. The paper proposes a three-step approach: first, sentences are POS-tagged; next, the tagged sentences pass through POS disambiguation of Persian noun and adjective homographs; finally, the Lesk algorithm (an unsupervised method) is applied to the result for WSD. The proposed approach speeds up WSD by filtering the dictionary glosses down to the relevant ones, and it also increases the accuracy of WSD. ; http://jipm.irandoc.ac.ir/browse.php?a_code=A-10-3228-2&slc_lang=en&sid=1 ; homographs ; Word Sense Disambiguation ; Part Of Speech tagging ; disambiguation of Persian nouns and adjective homographs ; Lesk algorithm |
title | Introducing a machine-based approach for Word Sense Disambiguation: using Lesk algorithm and Part Of Speech tagging |
topic | homographs Word Sense Disambiguation Part Of Speech tagging disambiguation of Persian nouns and adjective homographs Lesk algorithm |
url | http://jipm.irandoc.ac.ir/browse.php?a_code=A-10-3228-2&slc_lang=en&sid=1 |
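
The description field above outlines a pipeline in which POS tags are assigned first and the Lesk algorithm then compares only the glosses that match the tag. The following is a minimal sketch of that idea, not the article's implementation: it uses a simplified Lesk variant (word overlap between gloss and context), and the `LEXICON`, `STOPWORDS`, sense labels, tag names, and the English example word are all hypothetical stand-ins for a real Persian dictionary and tagger output.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    label: str   # hypothetical sense identifier
    pos: str     # POS tag the sense belongs to
    gloss: str   # dictionary definition text


# Hypothetical toy lexicon; a real system would load dictionary glosses.
LEXICON = {
    "light": [
        Sense("light/illumination", "NOUN",
              "radiation from a lamp or the sun that lets the eye see"),
        Sense("light/not-heavy", "ADJ",
              "having little weight and easy to carry or lift"),
        Sense("light/pale", "ADJ", "pale in colour, close to white"),
    ],
}

# Hypothetical stopword list, kept short for the example.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "on",
             "was", "is", "for", "that", "from"}


def gloss_words(text):
    """Lowercase content words of a gloss, without stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def candidate_senses(word, pos):
    """Keep only the senses whose POS matches the tag assigned earlier."""
    return [s for s in LEXICON.get(word.lower(), []) if s.pos == pos]


def disambiguate(tagged_sentence, target_index):
    """Simplified Lesk over the POS-filtered senses of one target word.

    tagged_sentence is a list of (token, pos) pairs, assumed to come from
    the earlier tagging and POS-disambiguation steps.
    """
    word, pos = tagged_sentence[target_index]
    candidates = candidate_senses(word, pos)
    if not candidates:
        return None
    context = {w.lower() for i, (w, _) in enumerate(tagged_sentence)
               if i != target_index} - STOPWORDS
    # Choose the sense whose gloss shares the most words with the context;
    # on a tie, max() returns the first candidate in lexicon order.
    return max(candidates, key=lambda s: len(gloss_words(s.gloss) & context))


tagged = [("The", "DET"), ("bag", "NOUN"), ("was", "VERB"), ("light", "ADJ"),
          ("and", "CONJ"), ("easy", "ADJ"), ("to", "PART"), ("carry", "VERB")]
print(disambiguate(tagged, 3).label)
```

In this toy case the ADJ tag removes the noun sense before any gloss is compared, so only two of the three glosses are scored; that reduction is the speed-up the description attributes to POS filtering.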