Arabic Aspect Extraction Based on Stacked Contextualized Embedding With Deep Learning


Bibliographic Details
Main Authors: Arwa Saif Fadel, Mostafa Elsayed Saleh, Osama Ahmed Abulnaja
Format: Article
Language: English
Published: IEEE 2022-01-01
Series: IEEE Access
Subjects: Arabic aspect extraction; AraBERT; BERT; flair embedding; BiLSTM; BiGRU
Online Access: https://ieeexplore.ieee.org/document/9733905/
collection DOAJ
description The exponential growth of the internet and the multi-fold increase in social media users over the last decade have produced a massive volume of unstructured data. Aspect-Based Sentiment Analysis (ABSA) is a fine-grained text analysis technique that groups opinions by the aspects they target, which makes it challenging. Aspect Extraction (AE) is one of the core subtasks of ABSA; it identifies the aspect terms mentioned in texts, comments, or reviews. Arabic AE is especially difficult because of the complexity of the Arabic language. This work advances Arabic AE through transfer learning with state-of-the-art pre-trained contextual language models. We concatenate a Bidirectional Encoder Representations from Transformers (BERT) language model with contextualized string embeddings (Flair embeddings) as a stacked embeddings layer for better Arabic word representation, and extend it with different deep learning architectures. Specifically, the Arabic contextual language model AraBERT and Flair embeddings are combined into a contextual stacked embeddings layer, followed by a BiLSTM-CRF or BiGRU-CRF layer for sequence labeling; the resulting models are called BF-BiLSTM-CRF and BF-BiGRU-CRF. The models are evaluated on the Arabic Hotels reviews dataset using the F1 score. The experimental results show that the BF-BiLSTM-CRF configuration outperforms the baseline and the other models, achieving an F1 score of 79.7%. (An illustrative code sketch of this pipeline is given at the end of this record.)
id doaj.art-0a1aa0b140ab4d91a445a95b3ad13d70
institution Directory of Open Access Journals
issn 2169-3536
doi 10.1109/ACCESS.2022.3159252
volume 10
pages 30526-30535
author_orcid Arwa Saif Fadel https://orcid.org/0000-0001-6663-2359
author_orcid Osama Ahmed Abulnaja https://orcid.org/0000-0003-3431-6890
author_affiliation Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University (KAU), Jeddah, Saudi Arabia (all three authors)
topic Arabic aspect extraction
AraBERT
BERT
flair embedding
BiLSTM
BiGRU
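
The pipeline described above (AraBERT stacked with Flair contextual string embeddings, feeding a BiLSTM-CRF or BiGRU-CRF sequence tagger) maps naturally onto the open-source Flair library. The following is a minimal illustrative sketch, not the authors' released code: the AraBERT checkpoint name (aubmindlab/bert-base-arabertv02), the Arabic Flair embedding identifiers (ar-forward / ar-backward), the data paths, column layout, and hyperparameters are all assumptions, and API names vary slightly across Flair versions.

# Illustrative sketch only: stacked AraBERT + Flair embeddings with a BiLSTM-CRF
# tagger for aspect-term extraction, built with the Flair library. Paths, model
# identifiers, and hyperparameters are assumptions, not values from the paper.
from flair.datasets import ColumnCorpus
from flair.embeddings import FlairEmbeddings, StackedEmbeddings, TransformerWordEmbeddings
from flair.models import SequenceTagger
from flair.trainers import ModelTrainer

# CoNLL-style data: one token per line, column 0 = token, column 1 = BIO aspect tag.
# File names are placeholders for a BIO-tagged Arabic Hotels reviews split.
corpus = ColumnCorpus(
    "data/arabic_hotels",
    column_format={0: "text", 1: "ner"},   # aspect tags handled as NER-style labels
    train_file="train.txt",
    test_file="test.txt",
)
# Older Flair versions use corpus.make_tag_dictionary(tag_type="ner") instead.
tag_dictionary = corpus.make_label_dictionary(label_type="ner")

# Stacked contextual embeddings: AraBERT (via Hugging Face) + Arabic Flair embeddings.
embeddings = StackedEmbeddings([
    TransformerWordEmbeddings("aubmindlab/bert-base-arabertv02"),  # assumed AraBERT checkpoint
    FlairEmbeddings("ar-forward"),
    FlairEmbeddings("ar-backward"),
])

# BiLSTM-CRF on top of the stacked embeddings; rnn_type="GRU" gives the BiGRU-CRF variant.
tagger = SequenceTagger(
    hidden_size=256,
    embeddings=embeddings,
    tag_dictionary=tag_dictionary,
    tag_type="ner",
    use_rnn=True,
    rnn_type="LSTM",
    use_crf=True,
)

# Train and evaluate; Flair reports micro-averaged F1 on the test split.
trainer = ModelTrainer(tagger, corpus)
trainer.train("models/bf-bilstm-crf", learning_rate=0.1, mini_batch_size=16, max_epochs=20)

The CRF layer (use_crf=True) decodes the label sequence jointly rather than per token, which is why it is paired with the bidirectional recurrent encoder in both proposed configurations.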