Natural Language Processing For Automatic text summarization [Datasets] - Survey

Natural language processing has advanced significantly in recent years, and text summarization has advanced with it. Summarization is no longer limited to reducing text size or extracting useful information from a long document. It is now also used for answering questions from summaries, measuring t...

Bibliographic Details
Main Authors: Alaa Ahmed AL-Banna, Abeer K. AL-Mashhadany
Format: Article
Language: English
Published: College of Computer and Information Technology – University of Wasit, Iraq 2022-12-01
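Series: Wasit Journal of Computer and Mathematics Science
Subjects: Natural Language Processing; Automatic Text Summarization; Abstractive Text Summarization; Extractive Text Summarization; Text Summarization Datasets
Online Access: https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72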
_version_ 1827276852285145088
author Alaa Ahmed AL-Banna
Abeer K. AL-Mashhadany
collection DOAJ
description Natural language processing has advanced significantly in recent years, and text summarization has advanced with it. Summarization is no longer limited to reducing text size or extracting useful information from a long document. It is now also used for answering questions from summaries, measuring the quality of sentiment analysis systems, research and mining techniques, document categorization, and natural language inference, which has increased the importance of research into producing good summaries. This paper reviews the most widely used text summarization datasets across different languages and types, along with the most effective methods for each dataset. The results are reported using text summarization metrics. The review indicates that pre-trained models achieved the highest scores on the summary measures in most of the surveyed works. English datasets made up about 75% of the datasets available to researchers, owing to the extensive use of the English language. Other languages, such as Arabic and Hindi, suffered from a scarcity of datasets, which limited academic progress.
first_indexed 2024-03-07T19:03:36Z
format Article
id doaj.art-a086f934a7a34d40a3651446edd05fd0
institution Directory Open Access Journal
issn 2788-5879
2788-5887
language English
last_indexed 2024-04-24T07:07:51Z
publishDate 2022-12-01
publisher College of Computer and Information Technology – University of Wasit, Iraq
record_format Article
series Wasit Journal of Computer and Mathematics Science
spelling doaj.art-a086f934a7a34d40a3651446edd05fd0; 2024-04-21T18:57:29Z; eng; College of Computer and Information Technology – University of Wasit, Iraq; Wasit Journal of Computer and Mathematics Science; 2788-5879; 2788-5887; 2022-12-01; 14; 10.31185/wjcm.72; Natural Language Processing For Automatic text summarization [Datasets] - Survey; Alaa Ahmed AL-Banna (Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq); Abeer K. AL-Mashhadany (Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq); https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72; Natural Language Processing; Automatic Text Summarization; Abstractive Text Summarization; Extractive Text Summarization; Text Summarization Datasets
title Natural Language Processing For Automatic text summarization [Datasets] - Survey
topic Natural Language Processing
Automatic Text Summarization
Abstractive Text Summarization
Extractive Text Summarization
Text Summarization Datasets
url https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72