Transformer Pruning Relation and General Neural Network Augmentation

This thesis investigates a method of initializing neural networks with weights transferred from smaller trained networks. We name this process augmentation and present a few versions of it, some of which involve pruning. Firstly, the pruning relation of testing loss against den...

Bibliographic Details
Main Author: Lim, Yong Hui
Other Authors: Shavit, Nir
Format: Thesis
Published: Massachusetts Institute of Technology 2022
Online Access: https://hdl.handle.net/1721.1/139547