NUWT: JAWI-SPECIFIC BUCKWALTER CORPUS FOR MALAY WORD TOKENIZATION
This paper describes the design and creation of a monolingual parallel corpus for the Malay language written in Jawi. It proposes a new corpus called the National University of Malaysia Word Tokenization (NUWT) corpus. To the best of our knowledge, there is currently no sufficiently compr...
Main Authors: | Juhaida Abu Bakar, Khairuddin Omar, Mohammad Faidzul Nasrudin, Mohd Zamri Murah |
---|---|
Format: | Article |
Language: | English |
Published: | UUM Press, 2016-05-01 |
Series: | Journal of ICT |
Online Access: | https://e-journal.uum.edu.my/index.php/jict/article/view/8172 |
Similar Items
- NUWT: Jawi-Specific Buckwalter Corpus for Malay Word Tokenization
  by: Abu Bakar, Juhaida, et al.
  Published: (2016)
- NUWT: Jawi-specific Buckwalter corpus for Malays word tokenization
  by: Abu Bakar, Juhaida, et al.
  Published: (2016)
- Maxenttagger for Malays Jawi POS-tags
  by: Abu Bakar, Juhaida, et al.
  Published: (2017)
- Part-of-speech for old Malay manuscript corpus: A review
  by: Abu Bakar, Juhaida, et al.
  Published: (2013)
- Tokenization of Assets: Security Tokens in Liechtenstein and Switzerland
  by: Angelika K. Layr
  Published: (2021-09-01)