Effect of Word Sense Disambiguation on Neural Machine Translation: A Case Study in Korean
With the advent of robust deep learning, neural machine translation (NMT) has achieved great progress and recently become the dominant paradigm in machine translation (MT). However, it is still confronted with the challenge of word ambiguities that force NMT to choose among several translation candi...
Main Authors: | Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Cheol-Young Ock |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2018-01-01 |
Series: | IEEE Access |
Subjects: | Lexical semantic network, neural machine translation, word sense disambiguation |
Online Access: | https://ieeexplore.ieee.org/document/8399736/ |
_version_ | 1818644972623101952 |
---|---|
author | Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Cheol-Young Ock |
author_facet | Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Cheol-Young Ock |
author_sort | Quang-Phuoc Nguyen |
collection | DOAJ |
description | With the advent of robust deep learning, neural machine translation (NMT) has achieved great progress and recently become the dominant paradigm in machine translation (MT). However, it is still confronted with the challenge of word ambiguities that force NMT to choose among several translation candidates that represent different senses of an input word. This research presents a case study using Korean word sense disambiguation (WSD) to improve NMT performance. First, we constructed a Korean lexical semantic network (LSN) as a large-scale lexical semantic knowledge base. Then, based on the Korean LSN, we built a Korean WSD preprocessor that can annotate the correct sense of Korean words in the training corpus. Finally, we conducted a series of translation experiments using Korean-English, Korean-French, Korean-Spanish, and Korean-Japanese language pairs. The experimental results show that our Korean WSD system can significantly improve the translation quality of NMT in terms of the BLEU, TER, and DLRATIO metrics. On average, it improved the precision by 2.94 BLEU points and improved translation error prevention by 4.04 TER points and 4.51 DLRATIO points for all the language pairs. |
first_indexed | 2024-12-17T00:23:21Z |
format | Article |
id | doaj.art-00017d67480c4314a41af376f01f0239 |
institution | Directory of Open Access Journals |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-17T00:23:21Z |
publishDate | 2018-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-00017d67480c4314a41af376f01f0239; 2022-12-21T22:10:31Z; eng; IEEE; IEEE Access; 2169-3536; 2018-01-01; 6; 38512; 38523; 10.1109/ACCESS.2018.2851281; 8399736; Effect of Word Sense Disambiguation on Neural Machine Translation: A Case Study in Korean; Quang-Phuoc Nguyen, https://orcid.org/0000-0001-7792-295X; Anh-Dung Vo, https://orcid.org/0000-0002-4363-2177; Joon-Choul Shin, https://orcid.org/0000-0002-6793-3687; Cheol-Young Ock, https://orcid.org/0000-0003-0020-1040; Department of IT Convergence, University of Ulsan, Ulsan, South Korea; https://ieeexplore.ieee.org/document/8399736/; Lexical semantic network, neural machine translation, word sense disambiguation |
title | Effect of Word Sense Disambiguation on Neural Machine Translation: A Case Study in Korean |
topic | Lexical semantic network, neural machine translation, word sense disambiguation |
url | https://ieeexplore.ieee.org/document/8399736/ |
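
The description above mentions a WSD preprocessor that annotates the sense of Korean words in the training corpus before NMT training. The idea of sense-tagging tokens can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `SENSES` inventory, the sense codes, the cue words, and the `annotate` function are hypothetical, and the context-overlap heuristic is a simple stand-in, not the LSN-based method of the article.

```python
# Hypothetical toy sense inventory: surface form -> {sense code: context cue words}.
# Codes and cue words are illustrative assumptions, not taken from the article's LSN.
SENSES = {
    "배": {
        "01": {"먹다", "과일", "달다"},   # "pear"
        "02": {"타다", "바다", "항구"},   # "ship"
        "03": {"아프다", "고프다"},       # "belly"
    },
}


def annotate(tokens):
    """Append a sense code to each ambiguous token, chosen by context-word overlap."""
    annotated = []
    for i, token in enumerate(tokens):
        senses = SENSES.get(token)
        if senses is None:
            # Token is not in the inventory: leave it unchanged.
            annotated.append(token)
            continue
        context = set(tokens[:i] + tokens[i + 1:])
        # Pick the sense whose cue words overlap most with the context;
        # ties fall back to the lowest sense code because max() keeps the first maximum.
        best = max(sorted(senses), key=lambda code: len(senses[code] & context))
        annotated.append(f"{token}_{best}")
    return annotated


print(annotate(["바다", "에서", "배", "를", "타다"]))
```

In a pipeline of this kind, the sense-tagged tokens (for example `배_02`) would replace the ambiguous surface forms on the source side of the parallel corpus, so that each sense receives its own vocabulary entry during NMT training.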