Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation

Abstract: Human evaluation of modern high-quality machine translation systems is a difficult problem, and there is increasing evidence that inadequate evaluation procedures can lead to erroneous conclusions. While there has been considerable research on human evaluation, the field stil...

Bibliographic Details
Main Authors: Markus Freitag, George Foster, David Grangier, Viresh Ratnakar, Qijun Tan, Wolfgang Macherey
Format: Article
Language: English
Published: The MIT Press 2021-01-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00437/108866/Experts-Errors-and-Context-A-Large-Scale-Study-of