NUWT: Jawi-specific Buckwalter corpus for Malay word tokenization
This paper describes the design and creation of a monolingual parallel corpus for the Malay language written in Jawi. It proposes a new corpus called the National University of Malaysia Word Tokenization (NUWT) corpus. To the best of our knowledge, currently, there is no sufficiently comprehe...
Main Authors: | Abu Bakar, Juhaida, Omar, Khairuddin, Nasrudin, Mohammad Faidzul, Murah, Mohd Zamri |
---|---|
Format: | Article |
Language: | English |
Published: | Universiti Utara Malaysia, 2016 |
Subjects: | |
Online Access: | https://repo.uum.edu.my/id/eprint/18485/1/JICT%2015%20%201%202016%20%20107%E2%80%93131.pdf |
Similar Items
- NUWT: Jawi-Specific Buckwalter Corpus for Malay Word Tokenization
  by: Abu Bakar, Juhaida, et al.
  Published: (2016)
- Maxenttagger for Malays Jawi POS-tags
  by: Abu Bakar, Juhaida, et al.
  Published: (2017)
- Part-of-speech for old Malay manuscript corpus: A review
  by: Abu Bakar, Juhaida, et al.
  Published: (2013)
- Tulisan Jawi: Tulisan Serantau
  by: Abd Jalil, Borham
  Published: (2012)
- Sejarah awal tulisan Jawi
  by: Hashim, M.
  Published: (1994)