Investigating backtranslation for the improvement of English-Irish machine translation
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | The Irish Association for Applied Linguistics, 2019-11-01 |
Series: | Teanga: The Journal of the Irish Association for Applied Linguistics |
Subjects: | |
Online Access: | https://journal.iraal.ie/index.php/teanga/article/view/88 |
Summary: | In this paper, we discuss the difficulties of building reliable machine translation systems for the English-Irish (EN-GA) language pair. In the context of limited datasets, we report on assessing the use of backtranslation as a method for creating artificial EN-GA data to increase the training data available to state-of-the-art data-driven translation systems. We compare our results to earlier work on EN-GA machine translation by Dowling et al. (2016, 2017, 2018). Although our own systems do not match that work in quality as measured by the traditionally reported BLEU metric, we provide a linguistic analysis suggesting that future work with domain-specific data may prove more successful. |
---|---|
ISSN: | 0332-205X, 2565-6325 |