Residual Information Flow for Neural Machine Translation
Automatic machine translation plays an important role in reducing language barriers between people who speak different languages. Deep neural networks (DNNs) have attained major success in diverse research fields such as computer vision, information retrieval, language modelling, and, more recently, machine translation...
Main Authors: | Shereen A. Mohamed, Mohamed A. Abdou, Ashraf A. Elsayed |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2022-01-01 |
Series: | IEEE Access |
Subjects: | Information flow; neural machine translation; neural sequence-to-sequence networks; residual connections; WMT14 English-German translation task |
Online Access: | https://ieeexplore.ieee.org/document/9941153/ |
_version_ | 1811225531222327296 |
---|---|
author | Shereen A. Mohamed; Mohamed A. Abdou; Ashraf A. Elsayed |
author_facet | Shereen A. Mohamed; Mohamed A. Abdou; Ashraf A. Elsayed |
author_sort | Shereen A. Mohamed |
collection | DOAJ |
description | Automatic machine translation plays an important role in reducing language barriers between people who speak different languages. Deep neural networks (DNNs) have attained major success in diverse research fields such as computer vision, information retrieval, language modelling, and, more recently, machine translation. Neural sequence-to-sequence networks have made noteworthy progress in machine translation. Inspired by the success of residual connections in other applications, in this work we introduce a novel NMT model that adopts residual connections to achieve better-performing automatic translation. Evaluation of the proposed model shows an improvement in translation accuracy of 0.3 BLEU points over the original model, using an ensemble of 5 LSTMs. Regarding training time complexity, the proposed model saves about 33% of the time the original model needs to train on datasets of short sentences. Deeper networks built with the proposed model also cope well with vanishing/exploding gradient problems. All experiments were performed on an NVIDIA Tesla V100 32G Passive GPU using the WMT14 English-German translation task. |
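The abstract describes wiring residual (skip) connections into a stacked recurrent NMT model. The paper's actual architecture is not reproduced in this record, so the following is only a minimal, hypothetical NumPy sketch of the core idea: each layer's input is added to its output (y_l = f_l(y_{l-1}) + y_{l-1}), so the identity path lets information and gradients flow through deep stacks. The `layer` function here is a toy tanh transform standing in for an LSTM layer.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 4  # toy hidden-state size

def layer(x, W):
    # Stand-in for one recurrent layer's output at a single time step.
    return np.tanh(W @ x)

def residual_stack(x, weights):
    # Residual connection: add the layer input back to its output,
    # so the skip path carries the identity through the whole stack.
    for W in weights:
        x = layer(x, W) + x
    return x

# Eight stacked "layers" with small random weights.
weights = [rng.standard_normal((hidden, hidden)) * 0.1 for _ in range(8)]
x = rng.standard_normal(hidden)
y = residual_stack(x, weights)
print(y.shape)  # (4,)
```

Without the `+ x` term, repeated squashing through `tanh` in a deep stack tends to shrink signals (and gradients); the additive skip path is what the abstract credits for the model's robustness to vanishing/exploding gradients at greater depth.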
first_indexed | 2024-04-12T09:08:49Z |
format | Article |
id | doaj.art-f9c52a2ec432496aa75b907168a26483 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-12T09:08:49Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-f9c52a2ec432496aa75b907168a26483; 2022-12-22T03:39:02Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; vol. 10, pp. 118313-118320; 10.1109/ACCESS.2022.3220691; 9941153; Residual Information Flow for Neural Machine Translation; Shereen A. Mohamed (https://orcid.org/0000-0003-1177-9278); Mohamed A. Abdou (https://orcid.org/0000-0003-0391-5946); Ashraf A. Elsayed (https://orcid.org/0000-0001-5438-938X); Department of Mathematics and Computer Science, Faculty of Science, Alexandria University, Alexandria, Egypt; Informatics Research Institute, City for Scientific Research and Technology Applications, Alexandria, Egypt; Department of Mathematics and Computer Science, Faculty of Science, Alexandria University, Alexandria, Egypt. Automatic machine translation plays an important role in reducing language barriers between people speaking different languages. Deep neural networks (DNN) have attained major success in diverse research fields such as computer vision, information retrieval, language modelling, and recently machine translation. Neural sequence-to-sequence networks have accomplished noteworthy progress for machine translation. Inspired by the success achieved by residual connections in different applications, in this work, we introduce a novel NMT model that adopts residual connections to achieve better performing automatic translation. Evaluation of the proposed model has shown an improvement in translation accuracy by 0.3 BLEU compared to the original model, using an ensemble of 5 LSTMs. Regarding training time complexity, the proposed model saves about 33% of the time needed by the original model to train datasets of short sentences. Deeper neural networks of the proposed model have shown a good performance in dealing with the vanishing/exploding problems. All experiments have been performed over NVIDIA Tesla V100 32G Passive GPU and using the WMT14 English-German translation task. https://ieeexplore.ieee.org/document/9941153/ Keywords: Information flow; neural machine translation; neural sequence-to-sequence networks; residual connections; WMT14 English-German translation task |
spellingShingle | Shereen A. Mohamed Mohamed A. Abdou Ashraf A. Elsayed Residual Information Flow for Neural Machine Translation IEEE Access Information flow neural machine translation neural sequence-to-sequence networks residual connections WMT14 English-German translation task |
title | Residual Information Flow for Neural Machine Translation |
title_full | Residual Information Flow for Neural Machine Translation |
title_fullStr | Residual Information Flow for Neural Machine Translation |
title_full_unstemmed | Residual Information Flow for Neural Machine Translation |
title_short | Residual Information Flow for Neural Machine Translation |
title_sort | residual information flow for neural machine translation |
topic | Information flow; neural machine translation; neural sequence-to-sequence networks; residual connections; WMT14 English-German translation task |
url | https://ieeexplore.ieee.org/document/9941153/ |
work_keys_str_mv | AT shereenamohamed residualinformationflowforneuralmachinetranslation AT mohamedaabdou residualinformationflowforneuralmachinetranslation AT ashrafaelsayed residualinformationflowforneuralmachinetranslation |