Natural Language Processing For Automatic text summarization [Datasets] - Survey

Bibliographic details
Main authors: Alaa Ahmed AL-Banna, Abeer K. AL-Mashhadany
Format: Article
Language: English
Published: College of Computer and Information Technology – University of Wasit, Iraq, 2022-12-01
Collection: Wasit Journal of Computer and Mathematics Science
Subjects:
Online access: https://wjcm.uowasit.edu.iq/index.php/wjcm/article/view/72
Description
Abstract: Natural language processing has advanced significantly in recent years, and the text summarization task has advanced with it. Summarization is no longer limited to reducing text size or extracting useful information from a long document. It is now also used for obtaining answers from summaries, measuring the quality of sentiment analysis systems, research and mining techniques, document categorization, and natural language inference, which has increased the importance of research into producing good summaries. This paper reviews the most widely used text summarization datasets across different languages and types, along with the most effective methods for each dataset. Results are reported using text summarization metrics. The review indicates that pre-trained models achieved the highest scores on the summarization measures in most of the surveyed work on these datasets. English-language datasets made up about 75% of the datasets available to researchers, owing to the extensive use of English. Other languages, such as Arabic and Hindi, suffered from a scarcity of dataset sources, which has limited progress in the academic field.
ISSN: 2788-5879; 2788-5887