A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification
Sarcasm identification on text documents is one of the most challenging tasks in natural language processing (NLP) and has become an essential research direction due to its prevalence on social media data. The purpose of our research is to present an effective sarcasm identification framework on socia...
Main Authors: | Aytug Onan, Mansur Alp Tocoglu |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2021-01-01 |
Series: | IEEE Access |
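Subjects: | Sarcasm identification; term weighting; neural language model; bidirectional long short-term memory |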
Online Access: | https://ieeexplore.ieee.org/document/9316208/ |
_version_ | 1818662682355564544 |
---|---|
author | Aytug Onan; Mansur Alp Tocoglu |
author_facet | Aytug Onan; Mansur Alp Tocoglu |
author_sort | Aytug Onan |
collection | DOAJ |
description | Sarcasm identification on text documents is one of the most challenging tasks in natural language processing (NLP) and has become an essential research direction due to its prevalence on social media data. The purpose of our research is to present an effective sarcasm identification framework on social media data by pursuing the paradigms of neural language models and deep neural networks. To represent text documents, we introduce an inverse gravity moment based term-weighted word embedding model with trigrams. In this way, critical words/terms receive higher values while the word-ordering information is kept. In our model, we present a three-layer stacked bidirectional long short-term memory architecture to identify sarcastic text documents. The presented framework has been evaluated on three sarcasm identification corpora. In the empirical analysis, three neural language models (i.e., word2vec, fastText and GloVe), two unsupervised term weighting functions (i.e., term frequency and TF-IDF) and eight supervised term weighting functions (i.e., odds ratio, relevance frequency, balanced distributional concentration, inverse question frequency-question frequency-inverse category frequency, short text weighting, inverse gravity moment, regularized entropy and inverse false negative-true positive-inverse category frequency) have been evaluated. For the sarcasm identification task, the presented model yields promising results with a classification accuracy of 95.30%. |
first_indexed | 2024-12-17T05:04:50Z |
format | Article |
id | doaj.art-3a0cba0bba3b44b186a1a2bc9a2576e1 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-17T05:04:50Z |
publishDate | 2021-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-3a0cba0bba3b44b186a1a2bc9a2576e1 2022-12-21T22:02:28Z eng IEEE IEEE Access 2169-3536 2021-01-01 9 7701 7722 10.1109/ACCESS.2021.3049734 9316208 A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification Aytug Onan 0 https://orcid.org/0000-0002-9434-5880 Mansur Alp Tocoglu 1 https://orcid.org/0000-0003-1784-9003 Department of Computer Engineering, Faculty of Engineering and Architecture, İzmir Katip Çelebi University, İzmir, Turkey Department of Software Engineering, Faculty of Technology, Manisa Celal Bayar University, Manisa, Turkey Sarcasm identification on text documents is one of the most challenging tasks in natural language processing (NLP) and has become an essential research direction due to its prevalence on social media data. The purpose of our research is to present an effective sarcasm identification framework on social media data by pursuing the paradigms of neural language models and deep neural networks. To represent text documents, we introduce an inverse gravity moment based term-weighted word embedding model with trigrams. In this way, critical words/terms receive higher values while the word-ordering information is kept. In our model, we present a three-layer stacked bidirectional long short-term memory architecture to identify sarcastic text documents. The presented framework has been evaluated on three sarcasm identification corpora. In the empirical analysis, three neural language models (i.e., word2vec, fastText and GloVe), two unsupervised term weighting functions (i.e., term frequency and TF-IDF) and eight supervised term weighting functions (i.e., odds ratio, relevance frequency, balanced distributional concentration, inverse question frequency-question frequency-inverse category frequency, short text weighting, inverse gravity moment, regularized entropy and inverse false negative-true positive-inverse category frequency) have been evaluated. For the sarcasm identification task, the presented model yields promising results with a classification accuracy of 95.30%. https://ieeexplore.ieee.org/document/9316208/ Sarcasm identification; term weighting; neural language model; bidirectional long short-term memory |
spellingShingle | Aytug Onan; Mansur Alp Tocoglu; A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification; IEEE Access; Sarcasm identification; term weighting; neural language model; bidirectional long short-term memory |
title | A Term Weighted Neural Language Model and Stacked Bidirectional LSTM Based Framework for Sarcasm Identification |
title_sort | term weighted neural language model and stacked bidirectional lstm based framework for sarcasm identification |
topic | Sarcasm identification; term weighting; neural language model; bidirectional long short-term memory |
url | https://ieeexplore.ieee.org/document/9316208/ |
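The description above mentions inverse gravity moment (IGM) term weighting applied to word embeddings. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the commonly cited IGM definition (a term's largest per-class frequency divided by the rank-weighted sum of its per-class frequencies, sorted in descending order) and a weight of the form 1 + lam * igm with lam = 7. The function names, the toy corpus, the two-dimensional embeddings, and the use of a plain weighted average (without the trigram component the record mentions) are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def igm_scores(docs, labels):
    """Return {term: igm}, with igm = f_1 / sum(r * f_r) over classes,
    where f_1 >= f_2 >= ... are the term's per-class document frequencies."""
    classes = sorted(set(labels))
    df = defaultdict(Counter)  # term -> class -> document frequency
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            df[term][label] += 1
    scores = {}
    for term, per_class in df.items():
        freqs = sorted((per_class.get(c, 0) for c in classes), reverse=True)
        denom = sum(rank * f for rank, f in enumerate(freqs, start=1))
        scores[term] = freqs[0] / denom  # denom >= 1: the term occurs somewhere
    return scores


def weighted_doc_vector(tokens, embeddings, scores, lam=7.0):
    """Average the word vectors of `tokens`, each weighted by 1 + lam * igm.
    `lam` is an assumed scaling coefficient, not a value from the record."""
    dim = len(next(iter(embeddings.values())))
    total, acc = 0.0, [0.0] * dim
    for tok in tokens:
        if tok not in embeddings:
            continue  # skip out-of-vocabulary tokens
        w = 1.0 + lam * scores.get(tok, 0.0)
        total += w
        acc = [a + w * x for a, x in zip(acc, embeddings[tok])]
    return [a / total for a in acc] if total else acc


# Toy data (hypothetical): two sarcastic documents and one literal one.
docs = [
    ["oh", "great", "another", "monday"],
    ["great", "work", "team"],
    ["love", "monday", "meetings"],
]
labels = ["sarcastic", "literal", "sarcastic"]
scores = igm_scores(docs, labels)

# Hypothetical 2-dimensional embeddings standing in for word2vec/fastText/GloVe.
embeddings = {"great": [1.0, 0.0], "monday": [0.0, 1.0]}
vec = weighted_doc_vector(["great", "monday"], embeddings, scores)
```

In this toy corpus "monday" occurs only in sarcastic documents, so it gets the maximum IGM score, while "great" is spread evenly across both classes and scores lower; the resulting document vector therefore leans toward the "monday" embedding.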