NUWT: JAWI-SPECIFIC BUCKWALTER CORPUS FOR MALAY WORD TOKENIZATION

This paper describes the design and creation of a monolingual parallel corpus for the Malay language written in Jawi. It proposes a new corpus called the National University of Malaysia Word Tokenization (NUWT) corpora. To the best of our knowledge, currently, there is no sufficiently compr...

Bibliographic Details
Main Authors: Juhaida Abu Bakar, Khairuddin Omar, Mohammad Faidzul Nasrudin, Mohd Zamri Murah
Format: Article
Language: English
Published: UUM Press 2016-05-01
Series: Journal of ICT
Online Access: https://e-journal.uum.edu.my/index.php/jict/article/view/8172