Dataset for comparable evaluation of machine translation between 11 South African languages
This data article describes the Autshumato machine translation evaluation set. The evaluation set contains data that can be used to evaluate machine translation systems between any of the 11 official South African languages. The dataset is parallel, with four reference translations available for each...
Main Authors: | Cindy A. McKellar, Martin J. Puttkammer |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2020-04-01 |
Series: | Data in Brief |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340920300408 |
_version_ | 1818302041890488320 |
---|---|
author | Cindy A. McKellar; Martin J. Puttkammer |
collection | DOAJ |
description | This data article describes the Autshumato machine translation evaluation set. The evaluation set contains data that can be used to evaluate machine translation systems between any of the 11 official South African languages. The dataset is parallel, with four reference translations available for each of the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, Sepedi, Sesotho, Setswana, Siswati, Tshivenḓa and Xitsonga. Keywords: Machine translation, Automatic evaluation, Natural language processing, Human language technology |
first_indexed | 2024-12-13T05:32:36Z |
format | Article |
id | doaj.art-6b253a05543c4ea095d0777617096c05 |
institution | Directory of Open Access Journals |
issn | 2352-3409 |
language | English |
last_indexed | 2024-12-13T05:32:36Z |
publishDate | 2020-04-01 |
publisher | Elsevier |
record_format | Article |
series | Data in Brief |
spelling | doaj.art-6b253a05543c4ea095d0777617096c05; 2022-12-21T23:58:01Z; eng; Elsevier; Data in Brief; 2352-3409; 2020-04-01; 29; Dataset for comparable evaluation of machine translation between 11 South African languages; Cindy A. McKellar (0), Martin J. Puttkammer (1); (0) Corresponding author; Centre for Text Technology, North-West University, South Africa; (1) Centre for Text Technology, North-West University, South Africa; This data article describes the Autshumato machine translation evaluation set. The evaluation set contains data that can be used to evaluate machine translation systems between any of the 11 official South African languages. The dataset is parallel with four reference translations available for each of the following languages: Afrikaans, English, isiNdebele, isiXhosa, isiZulu, Sepedi, Sesotho, Setswana, Siswati, Tshivenḓa and Xitsonga. Keywords: Machine translation, Automatic evaluation, Natural language processing, Human language technology; http://www.sciencedirect.com/science/article/pii/S2352340920300408 |
title | Dataset for comparable evaluation of machine translation between 11 South African languages |
url | http://www.sciencedirect.com/science/article/pii/S2352340920300408 |