Transformer Pruning Relation and General Neural Network Augmentation
This thesis investigates a method of initializing neural networks with weights transferred from smaller trained networks. We name this process augmentation and present a few versions of it, some of which involve pruning. Firstly, the pruning relation of testing loss against den...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2022 |
Online Access: | https://hdl.handle.net/1721.1/139547 |