A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification

Sarcasm identification in text documents is one of the most challenging tasks in natural language processing (NLP) and has become an essential research direction due to its prevalence in social media data. The purpose of our research is to present an effective sarcasm identification framework on socia...


Bibliographic Details
Main Authors: Aytug Onan, Mansur Alp Tocoglu
Format: Article
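Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Sarcasm identification; term weighting; neural language model; bidirectional long short-term memory
Online Access: https://ieeexplore.ieee.org/document/9316208/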
author Aytug Onan
Mansur Alp Tocoglu
collection DOAJ
description Sarcasm identification in text documents is one of the most challenging tasks in natural language processing (NLP) and has become an essential research direction due to its prevalence in social media data. The purpose of our research is to present an effective sarcasm identification framework for social media data by pursuing the paradigms of neural language models and deep neural networks. To represent text documents, we introduce an inverse gravity moment based term weighted word embedding model with trigrams. In this way, critical words/terms receive higher values while the word-ordering information is kept. In our model, we present a three-layer stacked bidirectional long short-term memory architecture to identify sarcastic text documents. The presented framework has been evaluated on three sarcasm identification corpora. In the empirical analysis, three neural language models (i.e., word2vec, fastText and GloVe), two unsupervised term weighting functions (i.e., term frequency and TF-IDF) and eight supervised term weighting functions (i.e., odds ratio, relevance frequency, balanced distributional concentration, inverse question frequency-question frequency-inverse category frequency, short text weighting, inverse gravity moment, regularized entropy and inverse false negative-true positive-inverse category frequency) have been evaluated. For the sarcasm identification task, the presented model yields promising results, with a classification accuracy of 95.30%.
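The term weighting idea in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the commonly cited inverse gravity moment definition (the largest class frequency of a term divided by the rank-weighted sum of its class frequencies, sorted in descending order) and the TF-IGM style factor 1 + lambda * igm with lambda = 7 as an illustrative default. The function names `igm_weights` and `weighted_vectors`, the toy corpus, and the toy embeddings are hypothetical, and the trigram step and the stacked bidirectional LSTM are not modeled here.

```python
from collections import Counter


def igm_weights(docs, labels, lam=7.0):
    """Return {term: 1 + lam * igm(term)} computed from per-class document frequencies.

    Assumption: igm(t) = f_1 / sum_r (f_r * r), where f_1 >= f_2 >= ... are the
    class-specific document frequencies of t sorted in descending order.
    """
    classes = sorted(set(labels))
    # class_df[c][t] = number of documents of class c that contain term t
    class_df = {c: Counter() for c in classes}
    for tokens, label in zip(docs, labels):
        class_df[label].update(set(tokens))

    vocab = set().union(*(class_df[c].keys() for c in classes))
    weights = {}
    for term in vocab:
        freqs = sorted((class_df[c][term] for c in classes), reverse=True)
        gravity_moment = sum(f * rank for rank, f in enumerate(freqs, start=1))
        igm = freqs[0] / gravity_moment  # equals 1.0 when the term occurs in one class only
        weights[term] = 1.0 + lam * igm
    return weights


def weighted_vectors(tokens, embeddings, weights):
    """Scale each known token's embedding by its term weight, keeping token order."""
    return [
        [weights.get(tok, 1.0) * x for x in embeddings[tok]]
        for tok in tokens
        if tok in embeddings
    ]


# Toy data (hypothetical): label 1 = sarcastic, label 0 = not sarcastic.
docs = [["oh", "great", "monday"], ["great", "job", "team"], ["love", "monday", "again"]]
labels = [1, 0, 1]
weights = igm_weights(docs, labels)

# Toy 2-dimensional embeddings; in practice these would come from word2vec, fastText or GloVe.
embeddings = {"great": [0.3, 0.6], "monday": [1.0, 0.5]}
sequence = weighted_vectors(["oh", "great", "monday"], embeddings, weights)
```

In this sketch a term concentrated in one class ("monday") gets a larger weight than a term spread evenly across classes ("great"), and the output stays a sequence of vectors in the original token order, which is the form a recurrent model would consume.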
first_indexed 2024-12-17T05:04:50Z
format Article
id doaj.art-3a0cba0bba3b44b186a1a2bc9a2576e1
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-17T05:04:50Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-3a0cba0bba3b44b186a1a2bc9a2576e1
2022-12-21T22:02:28Z
eng
IEEE
IEEE Access
ISSN 2169-3536
2021-01-01
Volume 9, pages 7701-7722
DOI 10.1109/ACCESS.2021.3049734
IEEE Xplore document 9316208
Aytug Onan, https://orcid.org/0000-0002-9434-5880, Department of Computer Engineering, Faculty of Engineering and Architecture, İzmir Katip Çelebi University, İzmir, Turkey
Mansur Alp Tocoglu, https://orcid.org/0000-0003-1784-9003, Department of Software Engineering, Faculty of Technology, Manisa Celal Bayar University, Manisa, Turkey
title A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification
topic Sarcasm identification
term weighting
neural language model
bidirectional long short-term memory
url https://ieeexplore.ieee.org/document/9316208/