Predicting Drug Review Polarity Using the Combination Model of Multi-Sense Word Embedding and Fuzzy Latent Dirichlet Allocation (FLDA)

The massive volume of textual data generated in recent years has led to the development of new computer-based technologies, especially in healthcare. Sentiment analysis opens a new door in healthcare to improve public health data analysis and efficiently predict diseases. Many word...


Bibliographic Details
Main Authors:	Siyue Song, Anju P. Johnson
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Access
Subjects:	Classification; drug review; feature extraction; fuzzy system; fuzzy set theory; healthcare data
Online Access:	https://ieeexplore.ieee.org/document/10290872/

author	Siyue Song, Anju P. Johnson
collection	DOAJ
description	The massive volume of textual data generated in recent years has driven the development of new computer-based technologies, especially in healthcare. Sentiment analysis opens a new door in healthcare, helping to improve public health data analysis and to predict diseases efficiently. Many words in natural language have multiple meanings, or senses. Traditional algorithms, however, mainly focus on a single meaning and cannot capture the multiple senses of a word, which can lead to inaccuracies in sentiment analysis. Dealing with vagueness in linguistic terms is another common challenge in natural language processing; in particular, simply applying term frequencies is insufficient to measure the development states of different topics. In this research, we applied two multi-sense word embedding models, Probabilistic FastText and Multi-Sense Skip-gram, to the sentiment analysis of drug reviews. The proposed models can better represent words with multiple meanings, producing more accurate sentiment analysis results. We also compared multi-sense word embeddings with single-sense embedding models and evaluated the classification methods against other classical machine learning techniques. Finally, a fuzzy system was applied to estimate the topics hidden in the drug review dataset using the Latent Dirichlet Allocation (LDA) model, and a fuzzy rule-based system was applied to explain the classification result of drug review polarity. Both models performed well on the classification task: Probabilistic FastText achieved an accuracy of 82.1%, and Multi-Sense Skip-gram achieved an accuracy of 79.8%. The work addresses several critical challenges in sentiment analysis of healthcare data and proposes a comprehensive approach to tackle them. The reported results indicate promising performance, and potential future applications in other medical domains beyond drug reviews further highlight the significance of this research.
first_indexed	2024-03-11T14:18:02Z
format	Article
id	doaj.art-31b1bba87c5c4656afc1dc4fb1caf67d
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-11T14:18:02Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-31b1bba87c5c4656afc1dc4fb1caf67d; 2023-10-31T23:00:29Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2023-01-01; vol. 11, pp. 118538-118546; DOI 10.1109/ACCESS.2023.3326757; document 10290872; Predicting Drug Review Polarity Using the Combination Model of Multi-Sense Word Embedding and Fuzzy Latent Dirichlet Allocation (FLDA); Siyue Song (https://orcid.org/0000-0003-2389-7503); Anju P. Johnson (https://orcid.org/0000-0002-7017-1644); affiliation (both authors): Department of Computer Science, Centre for Industrial Analytics (CIndA), School of Computing and Engineering, University of Huddersfield, Queensgate Campus, Huddersfield, U.K.; https://ieeexplore.ieee.org/document/10290872/; Classification; drug review; feature extraction; fuzzy system; fuzzy set theory; healthcare data
title	Predicting Drug Review Polarity Using the Combination Model of Multi-Sense Word Embedding and Fuzzy Latent Dirichlet Allocation (FLDA)
title_sort	predicting drug review polarity using the combination model of multi sense word embedding and fuzzy latent dirichlet allocation flda
topic	Classification; drug review; feature extraction; fuzzy system; fuzzy set theory; healthcare data
url	https://ieeexplore.ieee.org/document/10290872/


Similar Items