Opinion mining on product reviews from document to clause

Product reviews contain valuable information about product features and consumers’ purchasing preferences. Overwhelming amounts of such reviews have been generated daily, which demands an automatic opinion mining process on product reviews. On the other hand, sentiment analysis, also known as opinio...


Bibliographic Details
Main Author: Ting, Lin
Other Authors: Sun Aixin
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2022
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/163981
_version_ 1811685523303956480
author Ting, Lin
author2 Sun Aixin
author_facet Sun Aixin
Ting, Lin
author_sort Ting, Lin
collection NTU
description Product reviews contain valuable information about product features and consumers’ purchasing preferences. Overwhelming amounts of such reviews are generated daily, which demands an automatic opinion mining process on product reviews. On the other hand, sentiment analysis, also known as opinion mining, aims to discover sentiment polarity from opinionated documents. Opinion mining on product reviews involves multiple tasks, which can be categorized into two areas by their analysis granularity: coarse-grained and fine-grained sentiment analysis. However, there are limitations in existing solutions for the tasks. We propose different approaches in this thesis to overcome the challenges in a few tasks of opinion mining on product reviews. The coarse-grained task predicts sentiment polarity for a review as a whole, often referred to as document-level opinion mining. Existing methods neglect that opinionated reviews are subjective and often contain conflicting sentiments towards various aspects of a product, leading to poor performance in predicting more granular sentiment scores. Moreover, the same opinion-bearing words might indicate different satisfaction levels from different users, reflected as different sentiment polarity scores. Based on our observation, users tend to have a consistent word choice over time for expressing the same level of satisfaction. We propose the H-URA model with a simple architecture to mitigate the user bias in choosing opinion-bearing words while better capturing the salient sentiment indicator in review documents. Experiments on benchmark datasets show that our proposed method is effective and performs better than baseline models for document-level sentiment prediction. The fine-grained sentiment analysis aims to predict sentiment for product aspects, often referred to as aspect-based sentiment analysis (ABSA).
Existing methods tend to study ABSA at the sentence level and are affected by the inherent noise from non-target aspects in the same sentence. On the other hand, our observation of multiple review datasets shows that an elementary discourse unit (EDU) in a sentence tends to express a single aspect with a unitary sentiment polarity. We verify our assumptions on EDUs by annotating a large number of reviews. Hence, we propose the EDU-Capsule model to learn target-specific representation at the EDU level. Following the same assumption on EDUs, we propose the EDU-Attention model with a focus on complex sentences for the ABSA task. The proposed models perform better than strong baseline models in ABSA tasks. To further demonstrate the effectiveness of the EDU-based sentiment analysis, we design an opinion summarization system with EDUs in reviews as basic text units. The system generates two types of summaries: structured and text-based summaries. Our proposed generative model is fully unsupervised, without the need for reference summaries. To conclude, we observe bias among review authors in choosing opinion-bearing words and propose the H-URA model to learn such bias for predicting a precise sentiment score at a review level. Through observation of multiple datasets and annotating a large number of reviews, we assume an EDU expresses unitary sentiment towards a single aspect. Then, we propose the EDU-Capsule model for studying ABSA tasks at the EDU level. Also, we design the EDU-Attention model with a focus on complex sentences in solving the ABSA task, based on the same assumptions on EDUs. We have conducted extensive experiments on multiple benchmark datasets, and all proposed models achieve better performance compared to strong baselines. Lastly, we design a practical opinion summarization system with EDUs as basic text units for generating structured and text-based summaries.
first_indexed 2024-10-01T04:45:52Z
format Thesis-Doctor of Philosophy
id ntu-10356/163981
institution Nanyang Technological University
language English
last_indexed 2024-10-01T04:45:52Z
publishDate 2022
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/163981 2023-01-03T05:05:24Z Opinion mining on product reviews from document to clause Ting, Lin Sun Aixin School of Computer Science and Engineering AXSun@ntu.edu.sg Engineering::Computer science and engineering Product reviews contain valuable information about product features and consumers’ purchasing preferences. Overwhelming amounts of such reviews are generated daily, which demands an automatic opinion mining process on product reviews. On the other hand, sentiment analysis, also known as opinion mining, aims to discover sentiment polarity from opinionated documents. Opinion mining on product reviews involves multiple tasks, which can be categorized into two areas by their analysis granularity: coarse-grained and fine-grained sentiment analysis. However, there are limitations in existing solutions for the tasks. We propose different approaches in this thesis to overcome the challenges in a few tasks of opinion mining on product reviews. The coarse-grained task predicts sentiment polarity for a review as a whole, often referred to as document-level opinion mining. Existing methods neglect that opinionated reviews are subjective and often contain conflicting sentiments towards various aspects of a product, leading to poor performance in predicting more granular sentiment scores. Moreover, the same opinion-bearing words might indicate different satisfaction levels from different users, reflected as different sentiment polarity scores. Based on our observation, users tend to have a consistent word choice over time for expressing the same level of satisfaction. We propose the H-URA model with a simple architecture to mitigate the user bias in choosing opinion-bearing words while better capturing the salient sentiment indicator in review documents. Experiments on benchmark datasets show that our proposed method is effective and performs better than baseline models for document-level sentiment prediction.
The fine-grained sentiment analysis aims to predict sentiment for product aspects, often referred to as aspect-based sentiment analysis (ABSA). Existing methods tend to study ABSA at the sentence level and are affected by the inherent noise from non-target aspects in the same sentence. On the other hand, our observation of multiple review datasets shows that an elementary discourse unit (EDU) in a sentence tends to express a single aspect with a unitary sentiment polarity. We verify our assumptions on EDUs by annotating a large number of reviews. Hence, we propose the EDU-Capsule model to learn target-specific representation at the EDU level. Following the same assumption on EDUs, we propose the EDU-Attention model with a focus on complex sentences for the ABSA task. The proposed models perform better than strong baseline models in ABSA tasks. To further demonstrate the effectiveness of the EDU-based sentiment analysis, we design an opinion summarization system with EDUs in reviews as basic text units. The system generates two types of summaries: structured and text-based summaries. Our proposed generative model is fully unsupervised, without the need for reference summaries. To conclude, we observe bias among review authors in choosing opinion-bearing words and propose the H-URA model to learn such bias for predicting a precise sentiment score at a review level. Through observation of multiple datasets and annotating a large number of reviews, we assume an EDU expresses unitary sentiment towards a single aspect. Then, we propose the EDU-Capsule model for studying ABSA tasks at the EDU level. Also, we design the EDU-Attention model with a focus on complex sentences in solving the ABSA task, based on the same assumptions on EDUs. We have conducted extensive experiments on multiple benchmark datasets, and all proposed models achieve better performance compared to strong baselines.
Lastly, we design a practical opinion summarization system with EDUs as basic text units for generating structured and text-based summaries. Doctor of Philosophy 2022-12-28T00:43:47Z 2022-12-28T00:43:47Z 2022 Thesis-Doctor of Philosophy Ting, L. (2022). Opinion mining on product reviews from document to clause. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/163981 https://hdl.handle.net/10356/163981 10.32657/10356/163981 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
spellingShingle Engineering::Computer science and engineering
Ting, Lin
Opinion mining on product reviews from document to clause
title Opinion mining on product reviews from document to clause
title_full Opinion mining on product reviews from document to clause
title_fullStr Opinion mining on product reviews from document to clause
title_full_unstemmed Opinion mining on product reviews from document to clause
title_short Opinion mining on product reviews from document to clause
title_sort opinion mining on product reviews from document to clause
topic Engineering::Computer science and engineering
url https://hdl.handle.net/10356/163981
work_keys_str_mv AT tinglin opinionminingonproductreviewsfromdocumenttoclause