Impact of Sentence Representation Matching in Neural Machine Translation

Bibliographic Details
Main Authors: Heeseung Jung, Kangil Kim, Jong-Hun Shin, Seung-Hoon Na, Sangkeun Jung, Sangmin Woo
Format: Article
Language: English
Published: MDPI AG 2022-01-01
Series: Applied Sciences
Subjects: recurrent neural network, machine translation, similarity, sentence representation, guiding pressure
Online Access: https://www.mdpi.com/2076-3417/12/3/1313
author Heeseung Jung
Kangil Kim
Jong-Hun Shin
Seung-Hoon Na
Sangkeun Jung
Sangmin Woo
collection DOAJ
description Most neural machine translation models are implemented as a conditional language model framework composed of an encoder and a decoder. This framework learns complex, long-distance dependencies, but its deep structure makes training inefficient. Matching vector representations of the source and target sentences mitigates this inefficiency by shortening the path from parameters to costs, and it generalizes NMT models from a perspective different from cross-entropy loss. In this paper, we propose matching methods that derive the cost from constant word-embedding vectors of the source and target sentences. To find the best method, we analyze the impact of the methods under varying structures, distance metrics, and model capacities on a French-to-English translation task. An optimally configured method is then applied to English translation tasks from and to French, Spanish, and German. On these tasks, the method improved performance by up to 3.23 BLEU, with an average improvement of 0.71. We evaluated the robustness of the method across various embedding distributions and models, such as conventional gated structures and transformer networks, and the empirical results show that it has a high chance of improving performance in those models.
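The abstract describes deriving an auxiliary cost from constant (frozen) word-embedding vectors of the source and target sentences and combining it with the usual cross-entropy objective. A minimal sketch of that idea follows; the mean-pooling choice, the cosine distance metric, and the weight `alpha` are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np


def sentence_vector(embeddings: np.ndarray) -> np.ndarray:
    """Mean-pool constant word embeddings (tokens x dim) into one sentence vector."""
    return embeddings.mean(axis=0)


def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    """1 - cosine similarity; 0 when the two vectors point the same way."""
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def total_loss(cross_entropy: float,
               src_emb: np.ndarray,
               tgt_emb: np.ndarray,
               alpha: float = 0.1) -> float:
    """Cross-entropy plus a weighted matching cost between sentence vectors."""
    match = cosine_distance(sentence_vector(src_emb), sentence_vector(tgt_emb))
    return cross_entropy + alpha * match


# Toy example: 3 source tokens and 2 target tokens in a 4-dim embedding space.
rng = np.random.default_rng(0)
src = rng.normal(size=(3, 4))
tgt = rng.normal(size=(2, 4))
loss = total_loss(cross_entropy=2.5, src_emb=src, tgt_emb=tgt)
```

Because the matching term is non-negative and vanishes when the pooled representations align, it acts as the kind of auxiliary guiding pressure the abstract refers to, without replacing cross-entropy.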
first_indexed 2024-03-10T00:14:17Z
format Article
id doaj.art-a840124518574967a7c46dd784f8628f
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T00:14:17Z
publishDate 2022-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
citation Applied Sciences 12(3), 1313 (2022); doi:10.3390/app12031313
affiliation Heeseung Jung: Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
affiliation Kangil Kim: Gwangju Institute of Science and Technology (GIST), Gwangju 61005, Korea
affiliation Jong-Hun Shin: Electronics and Telecommunications Research Institute (ETRI), Gwangju 61012, Korea
affiliation Seung-Hoon Na: Department of Computer Science, Jeonbuk National University, Jeonju-si 54896, Korea
affiliation Sangkeun Jung: Computer Science and Engineering, Chungnam National University, Daejeon 34134, Korea
affiliation Sangmin Woo: Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
title Impact of Sentence Representation Matching in Neural Machine Translation
topic recurrent neural network
machine translation
similarity
sentence representation
guiding pressure
url https://www.mdpi.com/2076-3417/12/3/1313