Deep Sentiment Analysis: A Case Study on Stemmed Turkish Twitter Data

Sentiment analysis of stemmed Twitter data in various languages is an emerging research topic. In this paper, we apply three data augmentation techniques, namely Shift, Shuffle, and Hybrid, to increase the size of the training data. We then use three key types of deep learning (DL) models n...


Bibliographic Details
Main Authors: Harisu Abdullahi Shehu, Md. Haidar Sharif, Md. Haris Uddin Sharif, Ripon Datta, Sezai Tokat, Sahin Uyaver, Huseyin Kusetogullari, Rabie A. Ramadan
Format: Article
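Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Data augmentation; deep learning; machine learning; neural networks; sentiment analysis; Turkish
Online Access: https://ieeexplore.ieee.org/document/9395633/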
author Harisu Abdullahi Shehu
Md. Haidar Sharif
Md. Haris Uddin Sharif
Ripon Datta
Sezai Tokat
Sahin Uyaver
Huseyin Kusetogullari
Rabie A. Ramadan
collection DOAJ
description Sentiment analysis of stemmed Twitter data in various languages is an emerging research topic. In this paper, we apply three data augmentation techniques, namely Shift, Shuffle, and Hybrid, to increase the size of the training data. We then use three key types of deep learning (DL) models, namely the recurrent neural network (RNN), convolutional neural network (CNN), and hierarchical attention network (HAN), to classify stemmed Turkish Twitter data for sentiment analysis. The performance of these DL models is compared with that of existing traditional machine learning (TML) models. Stemming the data negatively affected the performance of the TML models, whereas the augmentation techniques greatly improved the performance of the DL models. Based on simulation, experimental, and statistical analysis of results on identical datasets, we conclude that the TML models outperform the DL models with respect to both training-time (TTM) and runtime (RTM) complexity, but the DL models outperform the TML models with respect to the most important performance factors as well as the average performance rankings.
first_indexed 2024-12-24T10:03:40Z
format Article
id doaj.art-a414a5901aac42dd9d9c1d3e45ea8be3
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-24T10:03:40Z
publishDate 2021-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-a414a5901aac42dd9d9c1d3e45ea8be3 (2022-12-21T17:00:54Z)
Language: eng
Publisher: IEEE
Journal: IEEE Access, ISSN 2169-3536
Published: 2021-01-01, vol. 9, pp. 56836-56854
DOI: 10.1109/ACCESS.2021.3071393
IEEE document number: 9395633
Title: Deep Sentiment Analysis: A Case Study on Stemmed Turkish Twitter Data
Authors and affiliations:
Harisu Abdullahi Shehu (https://orcid.org/0000-0002-9689-3290), School of Engineering and Computer Science, Victoria University of Wellington, Wellington, New Zealand
Md. Haidar Sharif (https://orcid.org/0000-0001-7235-6004), College of Computer Science and Engineering, University of Hail, Hail, Saudi Arabia
Md. Haris Uddin Sharif (https://orcid.org/0000-0002-1169-8438), Department of International Graduate Services, University of the Cumberlands, Williamsburg, KY, USA
Ripon Datta (https://orcid.org/0000-0003-4738-2918), Department of International Graduate Services, University of the Cumberlands, Williamsburg, KY, USA
Sezai Tokat (https://orcid.org/0000-0003-0193-8220), Department of Computer Engineering, Pamukkale University, Denizli, Turkey
Sahin Uyaver (https://orcid.org/0000-0001-8776-3032), Department of Energy Science and Technologies, Turkish-German University, Istanbul, Turkey
Huseyin Kusetogullari (https://orcid.org/0000-0001-5762-6678), Department of Computer Science, Blekinge Institute of Technology, Karlskrona, Sweden
Rabie A. Ramadan (https://orcid.org/0000-0002-0281-9381), Computer Engineering Department, College of Engineering, Cairo University, Cairo, Egypt
URL: https://ieeexplore.ieee.org/document/9395633/
Keywords: Data augmentation; deep learning; machine learning; neural networks; sentiment analysis; Turkish
title Deep Sentiment Analysis: A Case Study on Stemmed Turkish Twitter Data
topic Data augmentation
deep learning
machine learning
neural networks
sentiment analysis
Turkish
url https://ieeexplore.ieee.org/document/9395633/