Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.

Bibliographic Details
Main Author:	Liu, Jingjing, Ph. D. Massachusetts Institute of Technology
Other Authors:	Stephanie Seneff and Victor Zue.
Format:	Thesis
Language:	eng
Published:	Massachusetts Institute of Technology 2012
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/71481

_version_	1826209986490925056
author	Liu, Jingjing, Ph. D. Massachusetts Institute of Technology
author2	Stephanie Seneff and Victor Zue.
author_facet	Stephanie Seneff and Victor Zue. Liu, Jingjing, Ph. D. Massachusetts Institute of Technology
author_sort	Liu, Jingjing, Ph. D. Massachusetts Institute of Technology
collection	MIT
description	Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012.
first_indexed	2024-09-23T14:38:36Z
format	Thesis
id	mit-1721.1/71481
institution	Massachusetts Institute of Technology
language	eng
last_indexed	2024-09-23T14:38:36Z
publishDate	2012
publisher	Massachusetts Institute of Technology
record_format	dspace
spelling	mit-1721.1/71481 2019-04-10T15:57:43Z Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction Liu, Jingjing, Ph. D. Massachusetts Institute of Technology Stephanie Seneff and Victor Zue. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-164). Many assistant applications on mobile devices help people obtain rich Web content such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly, and it is impossible for people to browse and digest all the information via a simple search interface. To help users obtain information more efficiently, both the interface for data access and the information representation need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant: it engages a user in a continuous dialogue to garner the user's interest and capture the user's intent, and assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information concisely, and present it in an aggregated yet natural way, such as direct human dialogue. This thesis, therefore, aims to conduct research on a universal framework for developing speech-based interfaces that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. 
First, how can users' intentions be interpreted correctly from their spoken input? Second, how can the semantics and sentiment of user-generated data be interpreted and aggregated into structured yet concise summaries? Last, how can a dialogue modeling mechanism be developed to handle discourse and present the highlighted information via natural language? This thesis explores plausible approaches to tackle these challenges. We will explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We will investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We will also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration. by Jingjing Liu. Ph.D. 2012-07-02T15:46:39Z 2012-07-02T15:46:39Z 2012 2012 Thesis http://hdl.handle.net/1721.1/71481 795569633 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 180 p. application/pdf Massachusetts Institute of Technology
spellingShingle	Electrical Engineering and Computer Science. Liu, Jingjing, Ph. D. Massachusetts Institute of Technology Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction
title	Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction
title_sort	harvesting and summarizing user generated content for advanced speech based human computer interaction
topic	Electrical Engineering and Computer Science.
url	http://hdl.handle.net/1721.1/71481
