Development of a word segmentation algorithm for Myanmar language

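This study develops a word segmentation algorithm and solution for the Myanmar language. It is a first-of-its-kind effort at word segmentation for the Myanmar language using the Unicode Standard version 5.1. The Unicode encoding of the Myanmar character set was not very stable in the past, and version 5.1 introduces significant changes to address the major issues of previous versions. The literature review covers not only Myanmar script but also similar scripts such as Thai, Khmer (Cambodia) and Lao (Laos). Word segmentation approaches for Thai, Vietnamese and Chinese that are relevant to the study are also reviewed to understand how other solutions were developed and evaluated.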


Bibliographic Details
Main Author: U Tun Thura Thet
Other Authors: Na, Jin Cheon
Format: Thesis
Published: 2008
Subjects: DRNTU::Library and information science::Libraries::Technologies
Online Access: http://hdl.handle.net/10356/1939
author U Tun Thura Thet
author2 Na, Jin Cheon
author_facet Na, Jin Cheon
U Tun Thura Thet
author_sort U Tun Thura Thet
collection NTU
description This study develops a word segmentation algorithm and solution for the Myanmar language. It is a first-of-its-kind effort at word segmentation for the Myanmar language using the Unicode Standard version 5.1. The Unicode encoding of the Myanmar character set was not very stable in the past, and version 5.1 introduces significant changes to address the major issues of previous versions. The literature review covers not only Myanmar script but also similar scripts such as Thai, Khmer (Cambodia) and Lao (Laos). Word segmentation approaches for Thai, Vietnamese and Chinese that are relevant to the study are also reviewed to understand how other solutions were developed and evaluated.
first_indexed 2025-02-19T03:32:11Z
format Thesis
id ntu-10356/1939
institution Nanyang Technological University
last_indexed 2025-02-19T03:32:11Z
publishDate 2008
record_format dspace
spelling ntu-10356/19392019-12-10T12:02:59Z Development of a word segmentation algorithm for Myanmar language U Tun Thura Thet Na, Jin Cheon Wee Kim Wee School of Communication and Information Wu, Horng Jyh DRNTU::Library and information science::Libraries::Technologies This study is to develop a word segmentation algorithm and solution for Myanmar language. This is a first-of-its-kind for word segmentation in Myanmar language using the Unicode Standard version 5.1. The Unicode standard for Myanmar character set had not been very stable in the past and the recent version 5.1 is now included with some significant changes in order to address the major issues faced in the previous versions. The literature review for research covers the studies of not only Myanmar script but also the other similar scripts such as Thai, Cambodia and Laos. Some word segmentation approaches for Thai, Vietnamese and Chinese languages which are relevant to the studies are also reviewed to understand how other solutions were developed and evaluated. Master of Science (Information Studies) 2008-09-10T08:37:32Z 2008-09-10T08:37:32Z 2006 2006 Thesis http://hdl.handle.net/10356/1939 Nanyang Technological University application/pdf
title Development of a word segmentation algorithm for Myanmar language
topic DRNTU::Library and information science::Libraries::Technologies
url http://hdl.handle.net/10356/1939