Efficient semantic retrieval on K-segment coresets of user videos

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September 2015.

Bibliographic Details
Main Author:	Kandel, Pramod
Other Authors:	Daniela Rus and Guy Rosman.
Format:	Thesis
Language:	eng
Published:	Massachusetts Institute of Technology 2017
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/106443

Department:	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Notes:	Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September 2015. "August 2015." Cataloged from PDF version of thesis. Includes bibliographical references (pages 102-107). By Pramod Kandel.
Abstract:	Every day, we collect and store many kinds of data with our sensors, phones, cameras, and other gadgets. Video is among the richest of these data. We record many hours of video with our phones and cameras and store them on computers or in the cloud. However, because recorded videos produce large files, it is hard to search for and locate specific segments within a video library. We might need the part where "Matt was playing guitar", or we might want to see "the glimpse of John's laptop" among hours of video that contain those pieces. The goal of this thesis is to create a system that efficiently retrieves the relevant segments (frames) of a video by letting users run text searches on objects in the video, such as "guitar" or "laptop".
A big challenge with videos is the huge space required to store them, which makes them difficult to retrieve and analyze. This thesis presents an efficient compression method that uses k-segment mean coresets to represent the video data with fewer frames while preserving the information content of the original data set. The system then uses a state-of-the-art object detector to analyze and detect objects in the reduced data. The objects and their corresponding frames are stored and cross-linked to the original data to enable retrieval.
The system allows users to pose text queries about objects in the videos. It is important that retrieval of the stored objects be as efficient and meaningful as possible. This thesis presents a retrieval algorithm, also based on the k-segment mean coreset algorithm, that allows efficient any-time retrieval of the detected objects, returning the "more preferred" or "more important" frames earlier. The system presents these any-time results to users incrementally.
This thesis describes the architecture and modules of the object retrieval system for video data: the user interface for making the search query and displaying the results, the module for video compression with coresets, the object-detection module, the retrieval module, and the data flow between them. It also describes an implementation of the system, the algorithms used, and a suite of experiments that validate and evaluate the algorithms. The results show that, using coresets, it is possible to identify, store, and efficiently retrieve video segments by specifying the objects in video data.
Physical Description:	107 pages; application/pdf
Identifiers:	http://hdl.handle.net/1721.1/106443; 967346430
Dates:	2015; 2017-01-12T18:33:43Z; 2019-04-10T15:59:07Z
Rights:	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582

Similar Items