Event-centric Twitter photo summarization

Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2014.

Bibliographic Details
Main Author: Wen, Chung-Lin, S.M. Massachusetts Institute of Technology
Other Authors: Ramesh Raskar.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology 2014
Subjects: Architecture. Program in Media Arts and Sciences.
Online Access: http://hdl.handle.net/1721.1/91417
author Wen, Chung-Lin, S.M. Massachusetts Institute of Technology
author2 Ramesh Raskar.
author_facet Ramesh Raskar.
Wen, Chung-Lin, S.M. Massachusetts Institute of Technology
author_sort Wen, Chung-Lin, S.M. Massachusetts Institute of Technology
collection MIT
description Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2014.
first_indexed 2024-09-23T11:39:34Z
format Thesis
id mit-1721.1/91417
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T11:39:34Z
publishDate 2014
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/91417 2022-01-18T16:18:51Z Event-centric Twitter photo summarization Wen, Chung-Lin, S.M. Massachusetts Institute of Technology Ramesh Raskar. Massachusetts Institute of Technology. Department of Architecture. Program in Media Arts and Sciences. Program in Media Arts and Sciences (Massachusetts Institute of Technology) Architecture. Program in Media Arts and Sciences. Thesis: S.M., Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2014. 40 Cataloged from PDF version of thesis. Includes bibliographical references (pages 71-74). We develop a novel algorithm based on spectral geometry that summarizes a photo collection as a small subset that represents the collection well. While the definition of a good summarization may not be unique, we focus on two metrics in this thesis: representativeness and diversity. By representativeness we mean that each sampled photo should be similar to other photos in the dataset. The intuition is that, if each photo is regarded as a "vote" for the scene it depicts, we want to include the photos that receive many "votes". Diversity is also desirable because repeating the same information is an inefficient use of the few slots available for a summary. We achieve these seemingly contradictory properties by applying diversified sampling to the denser part of the feature space. The proposed method uses diffusion distance to measure the distance between any given pair of photos in the dataset. By emphasizing the connectivity of the local neighborhood, we achieve better accuracy than previous methods that used the global distance. The Heat Kernel Signature (HKS) is then used to separate the denser part of the data from the sparser part. By intersecting the denser parts obtained from different features, we are able to remove most of the outliers, i.e., photos that have few similar photos in the dataset. 
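The diffusion-distance and HKS steps described above can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the Gaussian affinity, the bandwidth `sigma`, the time scales `t`, the quantile threshold `keep_fraction`, and all function names are assumptions made for illustration. With this particular construction (normalized graph Laplacian, self-affinities kept), an isolated photo retains its heat and gets a high HKS, so the sketch treats a low HKS as "dense"; the thesis may use a different convention.

```python
import numpy as np

def spectral_decomposition(features, sigma=1.0):
    """Eigen-decompose the symmetrically normalized Gaussian affinity matrix."""
    sq = np.sum(features**2, axis=1)
    sq_dists = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    affinity = np.exp(-sq_dists / (2.0 * sigma**2))      # assumed kernel choice
    degree = affinity.sum(axis=1)
    normalized = affinity / np.sqrt(np.outer(degree, degree))
    eigvals, eigvecs = np.linalg.eigh(normalized)
    # The Gaussian kernel is positive semi-definite; clip numerical noise.
    return np.clip(eigvals, 0.0, 1.0), eigvecs, degree

def diffusion_distances(eigvals, eigvecs, degree, t=2):
    """Pairwise diffusion distances at time t (up to a constant scale)."""
    coords = (eigvecs / np.sqrt(degree)[:, None]) * eigvals**t
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt(np.sum(diff**2, axis=-1))

def heat_kernel_signature(eigvals, eigvecs, t=5.0):
    """Diagonal of exp(-t * L), where L = I - normalized affinity."""
    return (eigvecs**2) @ np.exp(-t * (1.0 - eigvals))

def dense_mask(hks, keep_fraction=0.5):
    """Mark points whose heat dissipates quickly (low HKS) as 'dense'."""
    return hks <= np.quantile(hks, keep_fraction)
```

Under these assumptions, the intersection across feature types would be an element-wise `dense_mask(hks_a) & dense_mask(hks_b)`, where `hks_a` and `hks_b` are hypothetical signatures computed from two different photo descriptors.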
Farthest Point Sampling (FPS) is then applied to give a diversified sampling, which produces our final summarization. The method can be applied to any image collection that has a specific topic but also a fair proportion of outliers. One scenario that especially motivated us to develop this technique is the set of Twitter photos of a specific event. Microblogging services have become a major way for people to share new information. However, the huge amount of data, its lack of structure, and its highly noisy nature prevent users from effectively mining useful information from it. Methods based on textual data exist, but the absence of visual information makes them less valuable. To the best of our knowledge, this study is the first to address visual data in Twitter event summarization. The output of our method can serve as a kind of "crowd-sourced news", useful for journalists as well as the general public. We illustrate our results by summarizing recent Twitter events and comparing them with summaries generated from metadata such as retweet counts. Our results are of at least the same quality, although they are produced by a fully automatic mechanism. In some cases, because metadata can be biased by factors such as the number of followers, our results are even better. We also note that, in our initial pilot study, the high-quality photos we found have little overlap with highly tweeted photos. This suggests that the signal we found is orthogonal to the retweet signal and that the two signals could potentially be combined to achieve even better results. by Chung-Lin Wen. S.M. 2014-11-04T21:35:13Z 2014-11-04T21:35:13Z 2014 2014 Thesis http://hdl.handle.net/1721.1/91417 893607735 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. 
http://dspace.mit.edu/handle/1721.1/7582 74 pages application/pdf Massachusetts Institute of Technology
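The Farthest Point Sampling step named in the abstract can be sketched as a greedy selection over a precomputed pairwise distance matrix. This is a generic textbook formulation rather than the thesis code; the function name, the `start` seed, and the handling of exhausted candidates are assumptions made for illustration.

```python
import numpy as np

def farthest_point_sampling(dist, k, start=0):
    """Greedily pick k indices, each maximizing its distance to the picks so far."""
    dist = np.asarray(dist, dtype=float)
    k = min(k, dist.shape[0])
    chosen = [start]
    # min_dist[i] = distance from point i to its nearest already-chosen point.
    min_dist = dist[start].copy()
    min_dist[start] = -np.inf           # never re-select a chosen point
    for _ in range(k - 1):
        nxt = int(np.argmax(min_dist))  # farthest remaining point
        chosen.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])
        min_dist[nxt] = -np.inf
    return chosen
```

To sample only from the denser part of the data, one could pass the sub-matrix `dist[np.ix_(idx, idx)]` for a hypothetical index array `idx` of retained photos and map the returned positions back through `idx`.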
spellingShingle Architecture. Program in Media Arts and Sciences.
Wen, Chung-Lin, S.M. Massachusetts Institute of Technology
Event-centric Twitter photo summarization
title Event-centric Twitter photo summarization
title_full Event-centric Twitter photo summarization
title_fullStr Event-centric Twitter photo summarization
title_full_unstemmed Event-centric Twitter photo summarization
title_short Event-centric Twitter photo summarization
title_sort event centric twitter photo summarization
topic Architecture. Program in Media Arts and Sciences.
url http://hdl.handle.net/1721.1/91417