A Survey on Machine Reading Comprehension—Tasks, Evaluation Metrics and Benchmark Datasets

Machine Reading Comprehension (MRC) is a challenging Natural Language Processing (NLP) research field with wide real-world applications. The great progress of this field in recent years is mainly due to the emergence of large-scale datasets and deep learning. Many MRC models have already surpassed human performance on various benchmark datasets, despite the obvious gap that remains between existing MRC models and genuine human-level reading comprehension. This shows the need to improve existing datasets, evaluation metrics, and models to move current MRC models toward "real" understanding. To address the current lack of a comprehensive survey of existing MRC tasks, evaluation metrics, and datasets, we (1) analyze 57 MRC tasks and datasets and propose a more precise classification of MRC tasks based on 4 different attributes; (2) summarize 9 evaluation metrics of MRC tasks, as well as 7 attributes and 10 characteristics of MRC datasets; and (3) discuss key open issues in MRC research and highlight future research directions. In addition, we have collected, organized, and published our data on a companion website where MRC researchers can directly access each MRC dataset, its papers, baseline projects, and the leaderboard.
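
Among the topics the survey covers are evaluation metrics for MRC. As a small illustration (not taken from the paper itself), two metrics that are standard for span-extraction MRC benchmarks such as SQuAD are Exact Match (EM) and token-level F1; below is a minimal Python sketch of both, using the answer-normalization conventions popularized by the SQuAD evaluation script:

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, and collapse
    whitespace (the normalization used by the SQuAD evaluation script)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1: harmonic mean of precision and recall over the
    multiset of normalized tokens shared by prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The cat", "cat"))                    # 1.0 (article stripped)
print(round(token_f1("a black cat", "black dog"), 3))   # 0.5
```

In practice, per-example scores like these are averaged over a dataset, and when a question has several gold answers, the maximum score over the references is taken.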

Bibliographic Details
Main Authors: Changchang Zeng, Shaobo Li, Qin Li, Jie Hu, Jianjun Hu
Author Affiliations: Changchang Zeng and Shaobo Li: Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610041, China; Qin Li and Jie Hu: College of Big Data Statistics, GuiZhou University of Finance and Economics, Guiyang 550025, China; Jianjun Hu: Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA
Format: Article
Language: English
Published: MDPI AG, 2020-10-01
Series: Applied Sciences, vol. 10, no. 21, article 7640
ISSN: 2076-3417
DOI: 10.3390/app10217640
Subjects: machine reading comprehension; survey; dataset
Online Access: https://www.mdpi.com/2076-3417/10/21/7640