An efficient Map Reduce-Based Hybrid NBC-TFIDF algorithm to mine the public sentiment on diabetes mellitus – A big data approach

Bibliographic Details
Main Authors: J. Ramsingh, V. Bhuvaneswari
Format: Article
Language: English
Published: Elsevier 2021-10-01
Series: Journal of King Saud University: Computer and Information Sciences
Subjects: Big-Data; Map reduce; Opinion; Social media; Diabetes
Online Access: http://www.sciencedirect.com/science/article/pii/S1319157818302593
collection DOAJ
description The growing use of the internet and social media has enabled people to exchange views, opinions and thoughts as never before, and this exchange of data has paved the way for sentiment analysis. The basic task of sentiment analysis is to classify data as positive, negative or neutral. This paper proposes an effective MapReduce-based hybrid NBC-TFIDF (Naive Bayes Classifier - Term Frequency Inverse Document Frequency) algorithm to mine public sentiment. A MapReduce-based hybrid NBC classifies the data according to the polarity score of each sentence in the social media data. The polarity score is calculated using an emotion corpus, and a diabetic corpus is created from the food Glycemic Index and a physical activity index. The study analyses the correlation between food habits, physical activity and diabetic risk factors among the Indian population using social network data. About two million data items were identified for the study, which is restricted to India. The experimental results show that the MapReduce-based hybrid NBC-TFIDF performs efficiently on a multi-node cluster. The results reveal that no individual factor alone is associated with diabetic risk; instead, a group of common factors contributes to diabetes mellitus. It is found that 60% of the social media data had positive polarity towards food items that are high in Glycemic Index, which is the main root cause of type 2 diabetes. This big data analysis reveals that the younger generation in India is unaware of the risk factors of diabetes mellitus.
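The abstract does not give the algorithm's details, so the following is only a minimal single-machine sketch of the general idea it describes: a map step that emits per-document term counts, a reduce step that turns them into TF-IDF weights, and a polarity score per document. Everything named here is an assumption for illustration: the `POLARITY` lexicon is a hypothetical stand-in for the paper's emotion corpus, the helper names are invented, and a lexicon-weighted TF-IDF sum is swapped in for the Naive Bayes step, which the record does not specify.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy lexicon; the paper's emotion corpus is not reproduced here.
POLARITY = {"love": 1.0, "tasty": 1.0, "great": 1.0,
            "hate": -1.0, "tired": -1.0, "bad": -1.0}


def map_doc(doc_id, text):
    """Map step: emit (term, (doc_id, count)) pairs for one document."""
    counts = Counter(text.lower().split())
    return [(term, (doc_id, n)) for term, n in counts.items()]


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce framework would."""
    grouped = defaultdict(list)
    for term, value in pairs:
        grouped[term].append(value)
    return grouped


def reduce_tfidf(postings, n_docs):
    """Reduce step: turn one term's postings into per-document TF-IDF weights."""
    # The +1 keeps terms that occur in every document from being zeroed out.
    idf = math.log(n_docs / len(postings)) + 1.0
    return {doc_id: count * idf for doc_id, count in postings}


def label(score):
    """Map a signed polarity score to one of the three sentiment classes."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def classify(docs):
    """Score each document as the sum of lexicon polarity times TF-IDF weight."""
    pairs = [p for doc_id, text in docs.items() for p in map_doc(doc_id, text)]
    scores = defaultdict(float)
    for term, postings in shuffle(pairs).items():
        weight = POLARITY.get(term, 0.0)
        if weight == 0.0:
            continue  # terms outside the lexicon carry no polarity
        for doc_id, tfidf in reduce_tfidf(postings, len(docs)).items():
            scores[doc_id] += weight * tfidf
    return {doc_id: label(scores[doc_id]) for doc_id in docs}


if __name__ == "__main__":
    sample = {
        "d1": "love tasty sweets",
        "d2": "hate feeling tired",
        "d3": "went for a walk",
    }
    print(classify(sample))
```

On a real cluster the map and reduce functions would run as distributed tasks and the shuffle would be done by the framework; the in-memory `shuffle` helper here only imitates that grouping.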
first_indexed 2024-12-22T00:40:33Z
id doaj.art-2d29ca04c623494daa087401bbcb95ba
institution Directory Open Access Journal
issn 1319-1578
spelling doaj.art-2d29ca04c623494daa087401bbcb95ba, 2022-12-21T18:44:41Z, eng. Elsevier, Journal of King Saud University: Computer and Information Sciences, ISSN 1319-1578, 2021-10-01, vol. 33, no. 8, pp. 1018-1029. J. Ramsingh (corresponding author) and V. Bhuvaneswari, both of the Department of Computer Applications, Bharathiar University, Coimbatore, India.
topic Big-Data
Map reduce
Opinion
Social media
Diabetes
url http://www.sciencedirect.com/science/article/pii/S1319157818302593