Towards understanding residual neural networks

Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019

Bibliographic Details
Main Author: Zeng, Brandon.
Other Authors: Aleksander Mądry.
Format: Thesis
Language: English
Published: Massachusetts Institute of Technology, 2019
Subjects: Electrical Engineering and Computer Science.
Online Access: https://hdl.handle.net/1721.1/123067

Abstract: Residual networks (ResNets) are now a prominent architecture in the field of deep learning, yet an explanation for their success remains elusive. The original view is that residual connections allow for the training of deeper networks, but it is not clear that added layers are always useful, or even how they are used. In this work, we find that residual connections distribute learning behavior across layers, allowing ResNets to effectively use deeper layers and outperform standard networks. We support this explanation with results on network gradients and representation learning, which show that residual connections make the training of individual residual blocks easier.

Notes: Cataloged from the PDF version of the thesis. Includes bibliographical references (page 37). 37 pages, application/pdf.

Rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source, but further reproduction or distribution in any format is prohibited without written permission. http://dspace.mit.edu/handle/1721.1/7582
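
For context on the abstract's subject (not code from the thesis itself): a residual block computes y = x + F(x), where the identity shortcut carries the input past the learned layers F. Below is a minimal PyTorch-style sketch of such a block; the channel count, layer choices, and names are assumptions for illustration only.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Sketch of a basic residual block: y = ReLU(x + F(x))."""

    def __init__(self, channels: int):
        super().__init__()
        # F(x): two 3x3 convolutions with batch norm, ReLU in between
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The identity shortcut lets gradients reach earlier layers directly,
        # the mechanism the abstract credits for easier per-block training.
        return self.relu(x + self.body(x))

# Usage: a 4-D activation tensor passes through with its shape unchanged.
x = torch.randn(8, 64, 32, 32)
y = ResidualBlock(64)(x)
assert y.shape == x.shape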