Interpretable neural networks via alignment and distribution propagation

This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.

Bibliographic Details
Main Author: Malalur, Paresh (Paresh G.)
Other Authors: Tommi Jaakkola.
Format: Thesis
Language: eng
Published: Massachusetts Institute of Technology 2019
Subjects: Electrical Engineering and Computer Science.
Online Access: https://hdl.handle.net/1721.1/122686
_version_ 1811080427139497984
author Malalur, Paresh (Paresh G.)
author2 Tommi Jaakkola.
author_facet Tommi Jaakkola.
Malalur, Paresh (Paresh G.)
author_sort Malalur, Paresh (Paresh G.)
collection MIT
description This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
first_indexed 2024-09-23T11:31:30Z
format Thesis
id mit-1721.1/122686
institution Massachusetts Institute of Technology
language eng
last_indexed 2024-09-23T11:31:30Z
publishDate 2019
publisher Massachusetts Institute of Technology
record_format dspace
spelling mit-1721.1/122686 2019-11-21T03:06:31Z Interpretable neural networks via alignment and distribution propagation Malalur, Paresh (Paresh G.) Tommi Jaakkola. Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 145-150).

In this thesis, we aim to develop methodologies to better understand and improve the performance of deep neural networks in settings where data is limited or missing. Unlike data-rich tasks, where neural networks have achieved human-level performance, other problems are naturally data-limited; on these, the models have fallen short of human-level performance and there is abundant room for improvement. We focus on three types of data-limited problems: one-shot learning and open-set recognition, unsupervised learning, and classification with missing data.

The first limited-data setting we tackle is when there are only a few examples per object type. During object classification, an attention mechanism can be used to highlight the area of the image that the model focuses on, offering a narrow view into the mechanism of classification. We expand on this idea by forcing the method to explicitly align images to be classified with reference images representing the classes. The mechanism of alignment is learned and therefore does not require that the reference objects resemble those being classified. Beyond explanation, our exemplar-based cross-alignment method enables classification with only a single example per category (one-shot) or in the absence of any labels about new classes (open-set).

While one-shot and open-set recognition operate in cases where complete data is available for a few examples, the unsupervised and missing-data settings focus on cases where the labels are missing or where only partial input is available, respectively. Variational Auto-Encoders are a popular unsupervised learning model that learns to map the input distribution into a simple latent distribution. We introduce a mechanism for approximate propagation of Gaussian densities through neural networks, using the Hellinger distance to find the best approximation, and demonstrate how to use this framework to improve the latent-code efficiency of Variational Auto-Encoders. Expanding on this idea further, we introduce a novel method to learn the mapping between the input space and the latent space that further improves the efficiency of the latent code by overcoming the variational bound.

The final limited-data setting we explore is when the input data is incomplete or very noisy. Neural networks are inherently feed-forward, so inference methods developed for probabilistic models cannot be applied directly. We introduce two different methods to handle missing data. We first introduce a simple feed-forward model that redefines the linear operator as an ensemble in order to reweight the activations when portions of its receptive field are missing.
We then use some of the insights gained to develop deep networks that propagate distributions of activations instead of point activations, allowing us to use message-passing methods to compensate for missing data while maintaining the feed-forward approach when no data is missing. by Paresh Malalur. Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science 2019-11-04T19:53:13Z 2019-11-04T19:53:13Z 2019 2019 Thesis https://hdl.handle.net/1721.1/122686 1124682413 eng MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582 150 pages application/pdf Massachusetts Institute of Technology
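To make the missing-data idea from the abstract concrete, here is a minimal sketch, not the thesis's actual formulation: a PyTorch linear layer that zeroes out missing inputs and reweights each output unit by how much of its incoming weight mass fell on observed inputs. The class name MaskedRescaledLinear and the specific rescaling rule are illustrative assumptions; the thesis describes an ensemble reinterpretation of the linear operator, and this is only a simplified stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedRescaledLinear(nn.Module):
    """Hypothetical sketch: a linear layer whose outputs are reweighted when
    parts of the receptive field are missing (not the thesis's exact method)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x, mask):
        # x: (batch, in_features); mask: same shape, 1.0 = observed, 0.0 = missing.
        pre = F.linear(x * mask, self.linear.weight)          # bias added after rescaling
        # Reweight each output unit by the fraction of its incoming |weight|
        # mass that landed on observed inputs.
        w_abs = self.linear.weight.abs()                      # (out_features, in_features)
        observed_mass = mask @ w_abs.t()                      # (batch, out_features)
        total_mass = w_abs.sum(dim=1)                         # (out_features,)
        scale = total_mass / observed_mass.clamp(min=1e-8)
        return pre * scale + self.linear.bias


# Usage: a batch where roughly half the input features are missing.
layer = MaskedRescaledLinear(in_features=784, out_features=128)
x = torch.randn(32, 784)
mask = (torch.rand(32, 784) > 0.5).float()
h = torch.relu(layer(x, mask))
print(h.shape)  # torch.Size([32, 128])
```

A plain nn.Linear would treat missing inputs as zeros and systematically shrink the pre-activations; the rescaling above is one simple way to compensate for that shrinkage.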
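The abstract also describes propagating distributions of activations rather than point activations, and approximating Gaussian densities as they pass through a network. The sketch below shows one standard way to do this for a diagonal Gaussian: exact mean and variance propagation through a linear layer, and analytic moment matching through a ReLU. The function names are hypothetical, and plain moment matching is used here purely as a simpler stand-in for the Hellinger-distance-based approximation developed in the thesis.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Normal

def linear_gaussian(mu, var, weight, bias=None):
    """Propagate a diagonal Gaussian N(mu, diag(var)) through y = Wx + b.
    Keeping only the diagonal of the output covariance is a simplifying
    assumption made for this illustration."""
    mu_out = F.linear(mu, weight, bias)
    var_out = F.linear(var, weight.pow(2))
    return mu_out, var_out

def relu_gaussian(mu, var, eps=1e-8):
    """Gaussian approximation of ReLU(z) for z ~ N(mu, var) by matching the
    first two moments (a stand-in for the thesis's Hellinger-based choice)."""
    std = var.clamp(min=eps).sqrt()
    alpha = mu / std
    std_normal = Normal(torch.zeros_like(mu), torch.ones_like(std))
    cdf = std_normal.cdf(alpha)                 # equals P(z > 0)
    pdf = std_normal.log_prob(alpha).exp()      # standard normal density at alpha
    mean = mu * cdf + std * pdf
    second_moment = (mu.pow(2) + var) * cdf + mu * std * pdf
    return mean, (second_moment - mean.pow(2)).clamp(min=0.0)


# Usage: push an input distribution through two layers, tracking uncertainty.
w1, b1 = torch.randn(64, 784) * 0.05, torch.zeros(64)
w2, b2 = torch.randn(10, 64) * 0.05, torch.zeros(10)
mu, var = torch.randn(8, 784), torch.full((8, 784), 0.1)
mu, var = relu_gaussian(*linear_gaussian(mu, var, w1, b1))
mu, var = linear_gaussian(mu, var, w2, b2)
print(mu.shape, var.shape)  # torch.Size([8, 10]) torch.Size([8, 10])
```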
spellingShingle Electrical Engineering and Computer Science.
Malalur, Paresh (Paresh G.)
Interpretable neural networks via alignment and distribution propagation
title Interpretable neural networks via alignment and distribution propagation
title_full Interpretable neural networks via alignment and distribution propagation
title_fullStr Interpretable neural networks via alignment and distribution propagation
title_full_unstemmed Interpretable neural networks via alignment and distribution propagation
title_short Interpretable neural networks via alignment and distribution propagation
title_sort interpretable neural networks via alignment and distribution propagation
topic Electrical Engineering and Computer Science.
url https://hdl.handle.net/1721.1/122686
work_keys_str_mv AT malalurpareshpareshg interpretableneuralnetworksviaalignmentanddistributionpropagation