A Comprehensive Review of Arabic Text Summarization

The explosion of online and offline data has changed how we gather, evaluate, and understand data. It is frequently difficult and time-consuming to comprehend large text documents and extract crucial information from them. Text summarization techniques address the mentioned problems by compressing l...

Bibliographic Details
Main Authors:	Asmaa Elsaid, Ammar Mohammed, Lamiaa Fattouh Ibrahim, Mohammed M. Sakre
Format:	Article
Language:	English
Published:	IEEE 2022-01-01
Series:	IEEE Access
Subjects:	Text summarization, Arabic natural language processing, machine learning, extractive text summarization, abstractive text summarization, deep learning models
Online Access:	https://ieeexplore.ieee.org/document/9745159/

author	Asmaa Elsaid, Ammar Mohammed, Lamiaa Fattouh Ibrahim, Mohammed M. Sakre
author_facet	Asmaa Elsaid, Ammar Mohammed, Lamiaa Fattouh Ibrahim, Mohammed M. Sakre
author_sort	Asmaa Elsaid
collection	DOAJ
description	The explosion of online and offline data has changed how we gather, evaluate, and understand data. It is frequently difficult and time-consuming to comprehend large text documents and extract crucial information from them. Text summarization techniques address these problems by compressing long texts while retaining their essential content. These techniques rely on the fast delivery of filtered, high-quality content to their users. Because of the massive amounts of data generated by technology and various sources, automated text summarization of large-scale data is challenging. There are three types of automatic text summarization techniques: extractive, abstractive, and hybrid. Regardless of the technique used, the generated summaries remain a long way from summaries produced by human experts. Although Arabic is a widely spoken language that is frequently used for content sharing on the web, summarization of Arabic content is limited and still immature because of several problems, including the Arabic language’s morphological structure, the variety of dialects, and the lack of adequate data sources. This paper reviews text summarization approaches and recent deep learning models for these approaches. It also reviews existing datasets for these approaches, along with their characteristics and limitations. The most often used metrics for summarization quality evaluation are ROUGE-1, ROUGE-2, ROUGE-L, and BLEU. The challenges encountered in Arabic text summarization methods and approaches, and the solutions proposed in each approach, are analyzed. Many Arabic text summarization methods have problems, such as the lack of golden tokens during testing, out-of-vocabulary (OOV) words, repeated summary sentences, the lack of standard systematic methodologies and architectures, and the complexity of the Arabic language. Finally, providing the required corpora, improving evaluation using semantic representations, addressing the lack of ROUGE metrics in abstractive text summarization, and adopting recent deep learning models in Arabic summarization studies are essential demands.
first_indexed	2024-12-10T13:28:18Z
format	Article
id	doaj.art-3d57ae145e164d3f9b048fecba547463
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-10T13:28:18Z
publishDate	2022-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-3d57ae145e164d3f9b048fecba547463; 2022-12-22T01:47:05Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; vol. 10, pp. 38012-38030; doi:10.1109/ACCESS.2022.3163292; 9745159; A Comprehensive Review of Arabic Text Summarization; Asmaa Elsaid (https://orcid.org/0000-0003-0514-1278), Ammar Mohammed (https://orcid.org/0000-0001-6844-9451), and Lamiaa Fattouh Ibrahim (https://orcid.org/0000-0001-5671-8941): Department of Computer Science, Faculty of Graduate Studies of Statistical Researches, Cairo University, Giza, Egypt; Mohammed M. Sakre: Higher Institute of Computer Science and Information Technology, El Shorouk, Egypt; https://ieeexplore.ieee.org/document/9745159/; Text summarization, Arabic natural language processing, machine learning, extractive text summarization, abstractive text summarization, deep learning models
title	A Comprehensive Review of Arabic Text Summarization
topic	Text summarization, Arabic natural language processing, machine learning, extractive text summarization, abstractive text summarization, deep learning models
url	https://ieeexplore.ieee.org/document/9745159/

Similar Items