Effect of Word Sense Disambiguation on Neural Machine Translation: A Case Study in Korean

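With the advent of robust deep learning, neural machine translation (NMT) has achieved great progress and recently become the dominant paradigm in machine translation (MT). However, it is still confronted with the challenge of word ambiguities that force NMT to choose among several translation candidates that represent different senses of an input word. This research presents a case study using Korean word sense disambiguation (WSD) to improve NMT performance. First, we constructed a Korean lexical semantic network (LSN) as a large-scale lexical semantic knowledge base. Then, based on the Korean LSN, we built a Korean WSD preprocessor that can annotate the correct sense of Korean words in the training corpus. Finally, we conducted a series of translation experiments using Korean-English, Korean-French, Korean-Spanish, and Korean-Japanese language pairs. The experimental results show that our Korean WSD system can significantly improve the translation quality of NMT in terms of the BLEU, TER, and DLRATIO metrics. On average, it improved the precision by 2.94 BLEU points and improved translation error prevention by 4.04 TER points and 4.51 DLRATIO points for all the language pairs.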
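
The abstract describes a WSD preprocessor that annotates Korean words with their senses before NMT training. The following is only a minimal sketch of that general idea, not the authors' system: it swaps in a simplified context-overlap (Lesk-style) rule in place of the Korean LSN, and the sense inventory `SENSES`, the tag format (`배_01`, `배_02`), and the function `annotate` are all hypothetical names chosen for illustration. It also assumes the input has already been split into dictionary-form tokens.

```python
from typing import Dict, List, Set

# Hypothetical toy sense inventory (NOT the Korean LSN from the paper):
# surface form -> {sense tag: cue words that tend to co-occur with that sense}.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "배": {
        "배_01": {"먹다", "과일", "달다"},      # pear
        "배_02": {"타다", "바다", "항구"},      # ship
    },
    "눈": {
        "눈_01": {"보다", "감다", "얼굴"},      # eye
        "눈_02": {"내리다", "겨울", "쌓이다"},  # snow
    },
}


def annotate(tokens: List[str]) -> List[str]:
    """Replace each ambiguous token with a sense tag chosen by context overlap.

    Tokens that are not in the inventory are passed through unchanged.
    Ties are broken by taking the alphabetically first sense tag.
    """
    tagged: List[str] = []
    for i, token in enumerate(tokens):
        candidates = SENSES.get(token)
        if not candidates:
            tagged.append(token)
            continue
        # Context is every other token in the sentence.
        context = set(tokens[:i] + tokens[i + 1:])
        best = max(sorted(candidates), key=lambda tag: len(candidates[tag] & context))
        tagged.append(best)
    return tagged


if __name__ == "__main__":
    print(annotate(["바다", "배", "타다"]))
    print(annotate(["과일", "배", "먹다"]))
```

With sense-tagged tokens on the source side, the two readings of an ambiguous word become distinct vocabulary items for the translation model, which is the effect the abstract attributes to the preprocessing step.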

Bibliographic Details
Main Authors: Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Cheol-Young Ock
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Lexical semantic network, neural machine translation, word sense disambiguation
Online Access: https://ieeexplore.ieee.org/document/8399736/
collection DOAJ
description With the advent of robust deep learning, neural machine translation (NMT) has achieved great progress and recently become the dominant paradigm in machine translation (MT). However, it is still confronted with the challenge of word ambiguities that force NMT to choose among several translation candidates that represent different senses of an input word. This research presents a case study using Korean word sense disambiguation (WSD) to improve NMT performance. First, we constructed a Korean lexical semantic network (LSN) as a large-scale lexical semantic knowledge base. Then, based on the Korean LSN, we built a Korean WSD preprocessor that can annotate the correct sense of Korean words in the training corpus. Finally, we conducted a series of translation experiments using Korean-English, Korean-French, Korean-Spanish, and Korean-Japanese language pairs. The experimental results show that our Korean WSD system can significantly improve the translation quality of NMT in terms of the BLEU, TER, and DLRATIO metrics. On average, it improved the precision by 2.94 BLEU points and improved translation error prevention by 4.04 TER points and 4.51 DLRATIO points for all the language pairs.
id doaj.art-00017d67480c4314a41af376f01f0239
institution Directory Open Access Journal
issn 2169-3536
spelling doaj.art-00017d67480c4314a41af376f01f0239
2022-12-21T22:10:31Z
language: eng
publisher: IEEE
journal: IEEE Access
issn: 2169-3536
date: 2018-01-01
volume: 6
pages: 38512-38523
doi: 10.1109/ACCESS.2018.2851281
article number: 8399736
title: Effect of Word Sense Disambiguation on Neural Machine Translation: A Case Study in Korean
authors: Quang-Phuoc Nguyen (https://orcid.org/0000-0001-7792-295X), Anh-Dung Vo (https://orcid.org/0000-0002-4363-2177), Joon-Choul Shin (https://orcid.org/0000-0002-6793-3687), Cheol-Young Ock (https://orcid.org/0000-0003-0020-1040)
affiliation (all authors): Department of IT Convergence, University of Ulsan, Ulsan, South Korea
url: https://ieeexplore.ieee.org/document/8399736/
keywords: Lexical semantic network, neural machine translation, word sense disambiguation
topic Lexical semantic network
neural machine translation
word sense disambiguation
url https://ieeexplore.ieee.org/document/8399736/