Interpretable neural networks via alignment and distribution propagation

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Bibliographic Details
Main Author: Malalur, Paresh (Paresh G.)
Other Authors: Tommi Jaakkola.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2019
Subjects: Electrical Engineering and Computer Science.
Online Access: https://hdl.handle.net/1721.1/122686
_version_ 1811080427139497984
author Malalur, Paresh (Paresh G.)
author2 Tommi Jaakkola.
author_facet Tommi Jaakkola.
Malalur, Paresh (Paresh G.)
author_sort Malalur, Paresh (Paresh G.)
collection MIT
description This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
first_indexed 2024-09-23T11:31:30Z
format Thesis
id mit-1721.1/122686
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T11:31:30Z
publishDate 2019
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/122686 2019-11-21T03:06:31Z Interpretable neural networks via alignment and distribution propagation Malalur, Paresh (Paresh G.) Tommi Jaakkola. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 145-150).

In this thesis, we aim to develop methodologies to better understand and improve the performance of deep neural networks in settings where data is limited or missing. Unlike data-rich tasks, where neural networks have achieved human-level performance, other problems are naturally data-limited; on these, the models have fallen short of human-level performance and there is abundant room for improvement. We focus on three types of data-limited problems: one-shot learning and open-set recognition, unsupervised learning, and classification with missing data.

The first limited-data setting we tackle is when there are only a few examples per object type. During object classification, an attention mechanism can be used to highlight the area of the image that the model focuses on, offering a narrow view into the mechanism of classification. We expand on this idea by forcing the method to explicitly align images to be classified with reference images representing the classes. The mechanism of alignment is learned and therefore does not require that the reference objects resemble those being classified. Beyond explanation, our exemplar-based cross-alignment method enables classification with only a single example per category (one-shot) or in the absence of any labels about new classes (open-set).

While one-shot and open-set recognition operate in cases where complete data is available for a few examples, the unsupervised and missing-data settings focus on cases where the labels are missing or where only partial input is available, respectively. Variational Auto-Encoders are a popular unsupervised learning model that learns to map the input distribution into a simple latent distribution. We introduce a mechanism for approximate propagation of Gaussian densities through neural networks, using the Hellinger distance to find the best approximation, and demonstrate how to use this framework to improve the latent-code efficiency of Variational Auto-Encoders. Expanding on this idea further, we introduce a novel method to learn the mapping between the input space and the latent space that further improves the efficiency of the latent code by overcoming the variational bound.

The final limited-data setting we explore is when the input data is incomplete or very noisy. Neural networks are inherently feed-forward, so inference methods developed for probabilistic models cannot be applied directly. We introduce two different methods to handle missing data. We first introduce a simple feed-forward model that redefines the linear operator as an ensemble in order to reweight the activations when portions of its receptive field are missing.
We then use some of the insights gained to develop deep networks that propagate distributions of activations instead of point activations, allowing us to use message-passing methods to compensate for missing data while maintaining the feed-forward approach when no data is missing. by Paresh Malalur. Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science 2019-11-04T19:53:13Z 2019-11-04T19:53:13Z 2019 2019 Thesis https://hdl.handle.net/1721.1/122686 1124682413 eng MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582 150 pages application/pdf Massachusetts Institute of Technology
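To make the missing-data idea from the abstract concrete, here is a minimal sketch, not the thesis's actual formulation: a PyTorch linear layer that zeroes out missing inputs and reweights each output unit by how much of its incoming weight mass fell on observed inputs. The class name MaskedRescaledLinear and the specific rescaling rule are illustrative assumptions; the thesis describes an ensemble reinterpretation of the linear operator, and this is only a simplified stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedRescaledLinear(nn.Module):
    """Hypothetical sketch: a linear layer whose outputs are reweighted when
    parts of the receptive field are missing (not the thesis's exact method)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x, mask):
        # x: (batch, in_features); mask: same shape, 1.0 = observed, 0.0 = missing.
        pre = F.linear(x * mask, self.linear.weight)          # bias added after rescaling
        # Reweight each output unit by the fraction of its incoming |weight|
        # mass that landed on observed inputs.
        w_abs = self.linear.weight.abs()                      # (out_features, in_features)
        observed_mass = mask @ w_abs.t()                      # (batch, out_features)
        total_mass = w_abs.sum(dim=1)                         # (out_features,)
        scale = total_mass / observed_mass.clamp(min=1e-8)
        return pre * scale + self.linear.bias


# Usage: a batch where roughly half the input features are missing.
layer = MaskedRescaledLinear(in_features=784, out_features=128)
x = torch.randn(32, 784)
mask = (torch.rand(32, 784) > 0.5).float()
h = torch.relu(layer(x, mask))
print(h.shape)  # torch.Size([32, 128])
```

A plain nn.Linear would treat missing inputs as zeros and systematically shrink the pre-activations; the rescaling above is one simple way to compensate for that shrinkage.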
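The abstract also describes propagating distributions of activations rather than point activations, and approximating Gaussian densities as they pass through a network. The sketch below shows one standard way to do this for a diagonal Gaussian: exact mean and variance propagation through a linear layer, and analytic moment matching through a ReLU. The function names are hypothetical, and plain moment matching is used here purely as a simpler stand-in for the Hellinger-distance-based approximation developed in the thesis.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Normal

def linear_gaussian(mu, var, weight, bias=None):
    """Propagate a diagonal Gaussian N(mu, diag(var)) through y = Wx + b.
    Keeping only the diagonal of the output covariance is a simplifying
    assumption made for this illustration."""
    mu_out = F.linear(mu, weight, bias)
    var_out = F.linear(var, weight.pow(2))
    return mu_out, var_out

def relu_gaussian(mu, var, eps=1e-8):
    """Gaussian approximation of ReLU(z) for z ~ N(mu, var) by matching the
    first two moments (a stand-in for the thesis's Hellinger-based choice)."""
    std = var.clamp(min=eps).sqrt()
    alpha = mu / std
    std_normal = Normal(torch.zeros_like(mu), torch.ones_like(std))
    cdf = std_normal.cdf(alpha)                 # equals P(z > 0)
    pdf = std_normal.log_prob(alpha).exp()      # standard normal density at alpha
    mean = mu * cdf + std * pdf
    second_moment = (mu.pow(2) + var) * cdf + mu * std * pdf
    return mean, (second_moment - mean.pow(2)).clamp(min=0.0)


# Usage: push an input distribution through two layers, tracking uncertainty.
w1, b1 = torch.randn(64, 784) * 0.05, torch.zeros(64)
w2, b2 = torch.randn(10, 64) * 0.05, torch.zeros(10)
mu, var = torch.randn(8, 784), torch.full((8, 784), 0.1)
mu, var = relu_gaussian(*linear_gaussian(mu, var, w1, b1))
mu, var = linear_gaussian(mu, var, w2, b2)
print(mu.shape, var.shape)  # torch.Size([8, 10]) torch.Size([8, 10])
```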
spellingShingle Electrical Engineering and Computer Science.
Malalur, Paresh (Paresh G.)
Interpretable neural networks via alignment and distribution propagation
title Interpretable neural networks via alignment and distribution propagation
title_full Interpretable neural networks via alignment and distribution propagation
title_fullStr Interpretable neural networks via alignment and distribution propagation
title_full_unstemmed Interpretable neural networks via alignment and distribution propagation
title_short Interpretable neural networks via alignment and distribution propagation
title_sort interpretable neural networks via alignment and distribution propagation
topic Electrical Engineering and Computer Science.
url https://hdl.handle.net/1721.1/122686
work_keys_str_mv AT malalurpareshpareshg interpretableneuralnetworksviaalignmentanddistributionpropagation