Natural Language Processing For Automatic text summarization [Datasets] - Survey
Recent advances in natural language processing have significantly advanced the task of text summarization. Summarization is no longer limited to reducing text size or extracting helpful information from a long document. It is now also used for getting answers from summaries, measuring t...
Main Authors: | Alaa Ahmed AL-Banna, Abeer K. AL-Mashhadany |
---|---|
Format: | Article |
Language: | English |
Published: | College of Computer and Information Technology – University of Wasit, Iraq, 2022-12-01 |
Series: | Wasit Journal of Computer and Mathematics Science |
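Subjects: | Natural Language Processing; Automatic Text Summarization; Abstractive Text Summarization; Extractive Text Summarization; Text Summarization Datasets |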
Online Access: | https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72 |
_version_ | 1827276852285145088 |
---|---|
author | Alaa Ahmed AL-Banna; Abeer K. AL-Mashhadany |
author_facet | Alaa Ahmed AL-Banna; Abeer K. AL-Mashhadany |
author_sort | Alaa Ahmed AL-Banna |
collection | DOAJ |
description | Recent advances in natural language processing have significantly advanced the task of text summarization. Summarization is no longer limited to reducing text size or extracting helpful information from a long document. It is now also used for getting answers from summaries, measuring the quality of sentiment analysis systems, research and mining techniques, document categorization, and natural language inference, which has increased the importance of research into producing good summaries. This paper reviews the most widely used text summarization datasets across different languages and types, along with the most effective methods for each dataset. Results are reported using text summarization metrics. The review indicates that pre-trained models achieved the highest scores on the summarization measures in most of the surveyed works. English datasets made up about 75% of the datasets available to researchers, owing to the extensive use of the English language. Other languages, such as Arabic and Hindi, suffered from a scarcity of datasets, which limited academic progress. |
first_indexed | 2024-03-07T19:03:36Z |
format | Article |
id | doaj.art-a086f934a7a34d40a3651446edd05fd0 |
institution | Directory Open Access Journal |
issn | 2788-5879 2788-5887 |
language | English |
last_indexed | 2024-04-24T07:07:51Z |
publishDate | 2022-12-01 |
publisher | College of Computer and Information Technology – University of Wasit, Iraq |
record_format | Article |
series | Wasit Journal of Computer and Mathematics Science |
spelling | doaj.art-a086f934a7a34d40a3651446edd05fd0; 2024-04-21T18:57:29Z; eng; College of Computer and Information Technology – University of Wasit, Iraq; Wasit Journal of Computer and Mathematics Science; 2788-5879; 2788-5887; 2022-12-01; 1; 4; 10.31185/wjcm.72; Natural Language Processing For Automatic text summarization [Datasets] - Survey; Alaa Ahmed AL-Banna (0); Abeer K. AL-Mashhadany (1); (0) Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq; (1) Department of Computer Science, College of Science, Al-Nahrain University, Baghdad, Iraq; https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72; Natural Language Processing; Automatic Text Summarization; Abstractive Text Summarization; Extractive Text Summarization; Text Summarization Datasets |
title | Natural Language Processing For Automatic text summarization [Datasets] - Survey |
topic | Natural Language Processing; Automatic Text Summarization; Abstractive Text Summarization; Extractive Text Summarization; Text Summarization Datasets |
url | https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72 |