Revisiting Multi-Domain Machine Translation

Abstract: When building machine translation systems, one often needs to make the best out of heterogeneous sets of parallel data in training, and to robustly handle inputs from unexpected domains in testing. This multi-domain scenario has attracted a lot of recent work that fall under...

Bibliographic Details
Main Authors: MinhQuang Pham, Josep Maria Crego, François Yvon
Format: Article
Language: English
Published: The MIT Press 2021-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00351/97775/Revisiting-Multi-Domain-Machine-Translation
_version_ 1818193481756049408
author MinhQuang Pham
Josep Maria Crego
François Yvon
collection DOAJ
description Abstract: When building machine translation systems, one often needs to make the best out of heterogeneous sets of parallel data in training, and to robustly handle inputs from unexpected domains in testing. This multi-domain scenario has attracted a lot of recent work that fall under the general umbrella of transfer learning. In this study, we revisit multi-domain machine translation, with the aim to formulate the motivations for developing such systems and the associated expectations with respect to performance. Our experiments with a large sample of multi-domain systems show that most of these expectations are hardly met and suggest that further work is needed to better analyze the current behaviour of multi-domain systems and to make them fully hold their promises.
first_indexed 2024-12-12T00:47:05Z
format Article
id doaj.art-73a669e6e7db48679b3dc19578d47cc1
institution Directory Open Access Journal
issn 2307-387X
language English
last_indexed 2024-12-12T00:47:05Z
publishDate 2021-01-01
publisher The MIT Press
record_format Article
series Transactions of the Association for Computational Linguistics
spelling doaj.art-73a669e6e7db48679b3dc19578d47cc1 | 2022-12-22T00:44:06Z | eng | The MIT Press | Transactions of the Association for Computational Linguistics | 2307-387X | 2021-01-01 | 9 | 17 | 35 | 10.1162/tacl_a_00351 | Revisiting Multi-Domain Machine Translation | MinhQuang Pham (0) | Josep Maria Crego (1) | François Yvon (2) | Université Paris-Saclay, CNRS, LIMSI, 91400, Orsay | SYSTRAN, 5 rue Feydeau, 75002 Paris, France. josep.crego@systrangroup.com | Université Paris-Saclay, CNRS, LIMSI, 91400, Orsay, France. francois.yvon@limsi.fr | Abstract: When building machine translation systems, one often needs to make the best out of heterogeneous sets of parallel data in training, and to robustly handle inputs from unexpected domains in testing. This multi-domain scenario has attracted a lot of recent work that fall under the general umbrella of transfer learning. In this study, we revisit multi-domain machine translation, with the aim to formulate the motivations for developing such systems and the associated expectations with respect to performance. Our experiments with a large sample of multi-domain systems show that most of these expectations are hardly met and suggest that further work is needed to better analyze the current behaviour of multi-domain systems and to make them fully hold their promises. | https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00351/97775/Revisiting-Multi-Domain-Machine-Translation
title Revisiting Multi-Domain Machine Translation
url https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00351/97775/Revisiting-Multi-Domain-Machine-Translation