Morphological Verb-Aware Tibetan Language Model
The Tibetan language model (TLM) is the key to Tibetan natural language processing. In this paper, we first observe that, unlike widely used languages, Tibetan contains many morphological verbs that rarely appear in natural sentences but play a key role in accurate text prediction. This prop...
Main Authors: | Kuntharrgyal Khysru, Di Jin, Jianwu Dang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2019-01-01 |
Series: | IEEE Access |
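Subjects: | Tibetan language model, text prediction, automatic speech recognition, morphological verb-aware model |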
Online Access: | https://ieeexplore.ieee.org/document/8723332/ |
_version_ | 1819133549460062208 |
---|---|
author | Kuntharrgyal Khysru; Di Jin; Jianwu Dang |
author_facet | Kuntharrgyal Khysru; Di Jin; Jianwu Dang |
author_sort | Kuntharrgyal Khysru |
collection | DOAJ |
description | The Tibetan language model (TLM) is the key to Tibetan natural language processing. In this paper, we first observe that, unlike widely used languages, Tibetan contains many morphological verbs that rarely appear in natural sentences but play a key role in accurate text prediction. Existing methods usually ignore this property, which makes traditional training strategies less effective at constructing accurate and robust TLMs. Because morphological verbs influence the tense and semantics of sentences, a Tibetan language model needs to account for them. Hence, we propose a morphological verb-aware TLM that combines offline learning via a character frequency reweighting strategy with online tuning of discriminative weights conditioned on morphological verbs. Compared with state-of-the-art methods, our method reduces both the perplexity and the character error on text prediction and automatic speech recognition (ASR) tasks. |
first_indexed | 2024-12-22T09:49:04Z |
format | Article |
id | doaj.art-beab411c72404763aa9c9d5ca3e3258c |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-22T09:49:04Z |
publishDate | 2019-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-beab411c72404763aa9c9d5ca3e3258c; 2022-12-21T18:30:27Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 72896; 72904; 10.1109/ACCESS.2019.2919328; 8723332; Morphological Verb-Aware Tibetan Language Model; Kuntharrgyal Khysru; 0; https://orcid.org/0000-0002-6673-9583; Di Jin; 1; Jianwu Dang; 2; Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China; Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China; Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China; The Tibetan language model (TLM) is the key to Tibetan natural language processing. In this paper, we first observe that, different from widely used languages, Tibetan contains many morphological verbs that rarely appear in natural sentences but play a key role in accurate text prediction. This property is usually ignored by existing methods and makes traditional training strategies less effective in constructing accurate and robust TLMs. Hence, we propose a morphological verb-aware TLM by offline learning via a character frequency reweighting strategy and online tuning of discriminative weights conditioned on morphological verbs. However, because of the influence of morphological verbs on the tense and semantics of sentences, it is necessary to consider the morphological verbs in Tibetan. As a result, compared with state-of-the-art methods, our method not only reduces the perplexity but also improves the character error on tasks of the text prediction and automatic speech recognition (ASR).; https://ieeexplore.ieee.org/document/8723332/; Tibetan language model; text prediction; automatic speech recognition; morphological verb-aware model |
spellingShingle | Kuntharrgyal Khysru; Di Jin; Jianwu Dang; Morphological Verb-Aware Tibetan Language Model; IEEE Access; Tibetan language model; text prediction; automatic speech recognition; morphological verb-aware model |
title | Morphological Verb-Aware Tibetan Language Model |
title_sort | morphological verb aware tibetan language model |
topic | Tibetan language model; text prediction; automatic speech recognition; morphological verb-aware model |
url | https://ieeexplore.ieee.org/document/8723332/ |
work_keys_str_mv | AT kuntharrgyalkhysru morphologicalverbawaretibetanlanguagemodel AT dijin morphologicalverbawaretibetanlanguagemodel AT jianwudang morphologicalverbawaretibetanlanguagemodel |