Product name and associated user sentiment retrieval from tweets

With a growing section of the web dedicated to the reviews, discussions and advertisement of products through micro-blogging, it has become imperative to develop techniques for automated Product name extraction from user generated short texts. In this report, we propose a system for mining Tweets to...


Bibliographic Details
Main Author: Saraf, Avnish
Other Authors: Gao Cong
Format: Final Year Project (FYP)
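Language: English
Published: 2015
Subjects: DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
Online Access: http://hdl.handle.net/10356/63473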
author Saraf, Avnish
author2 Gao Cong
author_facet Gao Cong
Saraf, Avnish
author_sort Saraf, Avnish
collection NTU
description With a growing section of the web dedicated to the review, discussion and advertisement of products through micro-blogging, it has become imperative to develop techniques for automated product name extraction from user-generated short texts. In this report, we propose a system that mines Tweets to extract product name mentions and the corresponding sentiment towards each product. We survey the information retrieval research landscape and adopt a hybrid method for product name extraction. Our novel method combines fuzzy dictionary matching with a CRF-based Named Entity Recognition (NER) approach to handle the inconsistencies of user-generated short texts during extraction. We also examine the widely studied field of sentiment mining: we begin by reviewing existing work and then propose a machine-learning-based approach that operates at sentence-level granularity and is adapted for microblogs. For evaluation, we build a dataset of 2,032 Tweets, manually annotated with the associated sentiment and the product name mentions. Evaluation on this data shows that our hybrid method outperforms all the existing methods, achieving a Precision of 0.95, a Recall of 0.98 and an F1 score of 0.97, along with a sentiment analysis accuracy of 69%. We also provide an extensive comparison of our algorithm with one of the most popular NER systems available, the Stanford NER, and show that our method produces a 38% improvement over it on user-generated micro text. A detailed analysis of the performance of the individual components is also provided to establish the synergistic performance of the hybrid method compared with the fuzzy dictionary matching method and the CRF method individually.
first_indexed 2024-10-01T07:32:45Z
format Final Year Project (FYP)
id ntu-10356/63473
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:32:45Z
publishDate 2015
record_format dspace
spelling ntu-10356/63473 2023-03-03T20:53:28Z Product name and associated user sentiment retrieval from tweets Saraf, Avnish Gao Cong School of Computer Engineering DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval Bachelor of Engineering (Computer Science) 2015-05-14T02:08:23Z 2015-05-14T02:08:23Z 2015 2015 Final Year Project (FYP) http://hdl.handle.net/10356/63473 en Nanyang Technological University 50 p. application/pdf
title Product name and associated user sentiment retrieval from tweets
title_sort product name and associated user sentiment retrieval from tweets
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval
url http://hdl.handle.net/10356/63473
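
Illustrative note on the abstract: the description mentions a fuzzy dictionary matching component that is combined with a CRF-based NER. The record does not give the report's dictionary, similarity measure or threshold, so the following is only a minimal sketch of the general idea of fuzzy dictionary matching, not the author's implementation. The product list, the 0.8 threshold, the function names and the use of Python's standard difflib similarity ratio are all assumptions made for illustration, and the CRF component is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical product dictionary; the report's actual dictionary is not
# given in this record.
PRODUCTS = ["iphone 6", "galaxy s5", "nexus 5"]


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] from difflib (an assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def find_product_mentions(tweet, products=PRODUCTS, threshold=0.8):
    """Return (dictionary_name, matched_text, score) for fuzzy matches.

    Each dictionary entry is compared against every window of the tweet
    that has the same number of tokens; only the best-scoring window at or
    above the (assumed) threshold is kept per entry.
    """
    tokens = tweet.lower().split()
    mentions = []
    for name in products:
        width = len(name.split())
        best = None
        for i in range(len(tokens) - width + 1):
            window = " ".join(tokens[i:i + width])
            score = similarity(window, name)
            if score >= threshold and (best is None or score > best[2]):
                best = (name, window, score)
        if best is not None:
            mentions.append(best)
    return mentions


if __name__ == "__main__":
    # A misspelled mention still maps to the dictionary entry "iphone 6".
    for name, text, score in find_product_mentions(
        "loving my new iphonee 6 so much"
    ):
        print(f"{text!r} -> {name!r} (score {score:.2f})")
```

A pure dictionary lookup of this kind tolerates misspellings but cannot find products missing from the dictionary, which is one plausible reason to pair it with a learned NER model as the abstract describes.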