Malay part-of-speech tagging: An ME-based approach
Main Authors: | , , , |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2016 |
Subjects: | |
Online Access: | https://repo.uum.edu.my/id/eprint/23520/1/ICT4T2016%20246%20251.pdf |
Summary: | Research on Malay Part-of-Speech (POS) tagging has greatly increased over the past few years. According to previous literature, POS tagging is the first phase of automated text analysis, and the development of language technologies can barely begin without it. The Malay language can be written in either the Roman or the Jawi script. We highlight existing POS tagging approaches and techniques, and suggest developing Malay Jawi POS tagging with an ME-based approach that uses specific contextual information from Malay corpora written in Jawi script. We conduct our test on the NUWT Corpus. The ME-based approach reaches an average accuracy of 89.30%, and yields precision and recall rates of 94% at the highest level of accuracy achieved. |