Malay part-of-speech tagging: An ME-based approach

Research on Malay part-of-speech (POS) tagging has increased greatly over the past few years. According to the literature, POS tagging is the first phase of automated text analysis, and language technologies can hardly be developed without it. Malay languag...

Bibliographic Details
Main Authors: Abu Bakar, Juhaida; Omar, Khairuddin; Nasrudin, Mohammad Faidzul; Murah, Mohd Zamri
Format: Conference or Workshop Item
Language: English
Published: 2016
Subjects: QA75 Electronic computers. Computer science
Online Access: https://repo.uum.edu.my/id/eprint/23520/1/ICT4T2016%20246%20251.pdf
author Abu Bakar, Juhaida
Omar, Khairuddin
Nasrudin, Mohammad Faidzul
Murah, Mohd Zamri
collection UUM
description Research on Malay part-of-speech (POS) tagging has increased greatly over the past few years. According to the literature, POS tagging is the first phase of automated text analysis, and language technologies can hardly be developed without it. The Malay language can be written in either the Roman or the Jawi script. We review existing POS tagging approaches and techniques and propose developing a Malay Jawi POS tagger using an ME-based approach, which draws on specific contextual information from Malay corpora written in Jawi script. We test the approach on the NUWT Corpus. The ME-based approach reaches an average accuracy of 89.30% and yields precision and recall of 94% at the highest accuracy achieved.
first_indexed 2024-07-04T06:23:48Z
format Conference or Workshop Item
id uum-23520
institution Universiti Utara Malaysia
language English
last_indexed 2024-07-04T06:23:48Z
publishDate 2016
record_format dspace
spelling uum-23520 2018-02-28T02:02:36Z https://repo.uum.edu.my/id/eprint/23520/ 2016-04-05 Conference or Workshop Item NonPeerReviewed application/pdf en Abu Bakar, Juhaida and Omar, Khairuddin and Nasrudin, Mohammad Faidzul and Murah, Mohd Zamri (2016) Malay part-of-speech tagging: An ME-based approach. In: International Conference on ICT for Transformation 2016, 05-07 April 2016, Center for postgraduate UMS Sabah Malaysia. (Unpublished)
title Malay part-of-speech tagging: An ME-based approach
topic QA75 Electronic computers. Computer science
url https://repo.uum.edu.my/id/eprint/23520/1/ICT4T2016%20246%20251.pdf