Improving the accuracy of sentiment analysis using a linguistic rule-based feature selection method in tourism reviews

Sentiment analysis involves extracting relevant information from an Unstructured User Reviews (UUR) dataset fetched from online sources and classifying the reviews as positive or negative to support decision-making. In UUR, the data may be noisy, and irrelevant features exist which cr...


Bibliographic Details
Main Authors: N. Saraswathi, T. Sasi Rooba, S. Chakaravarthi
Format: Article
Language: English
Published: Elsevier 2023-10-01
Series: Measurement: Sensors
Subjects: Linguistic rule; Sentiment analysis; Feature selection; POS tag; N-gram
Online Access: http://www.sciencedirect.com/science/article/pii/S2665917423002246
author N. Saraswathi
T. Sasi Rooba
S. Chakaravarthi
collection DOAJ
description Sentiment analysis involves extracting relevant information from an Unstructured User Reviews (UUR) dataset fetched from online sources and classifying the reviews as positive or negative to support decision-making. UUR data may be noisy, and irrelevant features create a high-dimensional feature space. Designing an effective sentiment learning model therefore requires extracting the most relevant sentiment features from UUR. To address this issue, we propose a linguistic rule-based feature selection method that extracts and selects sentiment features for sentiment analysis, improving the predictive performance of classification algorithms. The proposed novel feature selection method identifies sentiment features in the review dataset using filtering methods such as POS tags and n-grams. The selected sentiment feature sets are then used in an ensemble model in which the Random Forest classification algorithm is trained for textual sentiment classification. Finally, we test our approach on a real-time review dataset collected from multiple sources, and the results show higher prediction accuracy than existing sentiment analysis techniques.
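The filtering step described in the abstract (keeping only n-grams whose POS-tag pattern matches a linguistic rule) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy lexicon `TOY_POS`, the rule set `RULES`, and the function names are all hypothetical, and a real pipeline would use a trained POS tagger and pass the selected features to a classifier such as Random Forest.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy POS lexicon (assumption); a real system would use a trained tagger.
TOY_POS: Dict[str, str] = {
    "the": "DT", "hotel": "NN", "was": "VB", "very": "RB", "clean": "JJ",
    "staff": "NN", "not": "RB", "friendly": "JJ", "room": "NN",
    "great": "JJ", "view": "NN",
}

# Illustrative linguistic rules (assumption): tag patterns likely to carry sentiment.
RULES: Set[Tuple[str, ...]] = {("JJ",), ("RB", "JJ"), ("JJ", "NN")}


def tag(tokens: List[str]) -> List[Tuple[str, str]]:
    """Look up each token in the toy lexicon; unknown words default to NN."""
    return [(tok, TOY_POS.get(tok, "NN")) for tok in tokens]


def ngrams(items: List[Tuple[str, str]], n: int) -> List[Tuple[Tuple[str, str], ...]]:
    """Return all contiguous n-grams of (word, tag) pairs."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def select_features(review: str, max_n: int = 2) -> List[str]:
    """Keep only the n-grams (n = 1..max_n) whose tag sequence matches a rule."""
    tagged = tag(review.lower().split())  # whitespace tokenisation only
    features: List[str] = []
    for n in range(1, max_n + 1):
        for gram in ngrams(tagged, n):
            tags = tuple(t for _, t in gram)
            if tags in RULES:
                features.append(" ".join(w for w, _ in gram))
    return features


if __name__ == "__main__":
    print(select_features("The staff was not friendly"))
```

Unigrams are emitted before bigrams, so negated phrases such as "not friendly" appear alongside the bare adjective; whether to keep both is a design choice the abstract does not specify.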
first_indexed 2024-03-12T00:04:50Z
format Article
id doaj.art-15b541814259432eb822fdfb0fb0d0c3
institution Directory Open Access Journal
issn 2665-9174
language English
last_indexed 2024-03-12T00:04:50Z
publishDate 2023-10-01
publisher Elsevier
record_format Article
series Measurement: Sensors
spelling doaj.art-15b541814259432eb822fdfb0fb0d0c3
2023-09-17T04:57:30Z
eng
Elsevier
Measurement: Sensors
2665-9174
2023-10-01
29
100888
Improving the accuracy of sentiment analysis using a linguistic rule-based feature selection method in tourism reviews
N. Saraswathi (Dept. of Computer Science and Engineering, Annamalai University, Annamalainagar, 608002, India; corresponding author)
T. Sasi Rooba (Dept. of Computer Science and Engineering, Annamalai University, Annamalainagar, 608002, India)
S. Chakaravarthi (Dept. of Artificial Intelligence and Data Science, Panimalar Engineering College Poonamallie, Chennai, 600123 India)
http://www.sciencedirect.com/science/article/pii/S2665917423002246
Linguistic rule
Sentiment analysis
Feature selection
POS tag
N-gram
title Improving the accuracy of sentiment analysis using a linguistic rule-based feature selection method in tourism reviews
topic Linguistic rule
Sentiment analysis
Feature selection
POS tag
N-gram
url http://www.sciencedirect.com/science/article/pii/S2665917423002246