Toward efficient deep learning with sparse neural networks

Despite the tremendous success that deep learning has achieved in recent years, it remains challenging to deal with the excessive computational and memory costs involved in executing deep learning-based applications. To address this challenge, this thesis studies sparse neural networks, focusing on their construction, initialization, and large-scale training, as a step toward efficient deep learning.

Firstly, this thesis addresses the problem of finding sparse neural networks by pruning. Network pruning is an effective methodology for sparsifying neural networks, yet existing approaches often introduce hyperparameters that must either be tuned with expert knowledge or be set by ad-hoc intuition, and they typically entail iterative training steps. Instead, this thesis begins by proposing an efficient pruning method that is applied to a neural network prior to training, in a single shot. The sparse neural networks obtained with this method, once trained, exhibit state-of-the-art performance on various image classification tasks.
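
The abstract does not state the pruning criterion itself. Purely as a rough illustration of what single-shot pruning before training can look like, the minimal PyTorch sketch below scores every weight on one mini-batch with a gradient-times-weight saliency and keeps only the highest-scoring connections; the saliency choice and the helper prune_at_init are illustrative assumptions, not the thesis's exact method.

import torch
import torch.nn.functional as F

def prune_at_init(model, inputs, targets, sparsity=0.9):
    # One forward/backward pass on a single mini-batch scores all weights.
    loss = F.cross_entropy(model(inputs), targets)
    weights = [p for p in model.parameters() if p.dim() > 1]  # skip biases
    grads = torch.autograd.grad(loss, weights)
    # Illustrative saliency: |gradient * weight|, pooled across layers.
    scores = torch.cat([(g * w).abs().flatten() for g, w in zip(grads, weights)])
    keep = max(1, int((1.0 - sparsity) * scores.numel()))
    threshold = torch.topk(scores, keep).values.min()
    # Binary masks marking the surviving connections.
    return [((g * w).abs() >= threshold).float() for g, w in zip(grads, weights)]

The returned masks would then be applied multiplicatively to the corresponding weight tensors throughout training, so that pruned connections stay at zero.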

Although efficient, it remains unclear exactly why this approach of pruning at initialization can be effective. This thesis then extends the method by developing a new perspective, from which the problem of finding trainable sparse neural networks is approached through network initialization. Identifying initialization as a key to finding and training sparse neural networks successfully, this thesis proposes a sufficient initialization condition that can easily be satisfied with a simple optimization step and that, once achieved, accelerates the training of sparse neural networks significantly.
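
The condition itself is not spelled out in the abstract. Purely as an illustration of how an initialization condition might be enforced with a simple optimization step, the sketch below (the helper enforce_orthogonality is hypothetical) nudges the weight matrices of a freshly initialized, possibly pruned network toward approximate layerwise orthogonality, a common recipe for well-behaved signal propagation; it is not necessarily the condition proposed in the thesis.

import torch

def enforce_orthogonality(weight_list, steps=200, lr=0.1):
    # weight_list: 2-D weight tensors; returns adjusted copies.
    weights = [w.detach().clone().requires_grad_(True) for w in weight_list]
    opt = torch.optim.SGD(weights, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Penalize deviation of W^T W from the identity for every layer.
        penalty = sum(((w.t() @ w) - torch.eye(w.shape[1])).pow(2).sum() for w in weights)
        penalty.backward()
        opt.step()
    return [w.detach() for w in weights]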

While sparse neural networks can be obtained by pruning at initialization, there has been little study of how these sparse networks are subsequently trained. This thesis lastly concentrates on studying data parallelism, a straightforward approach to speeding up neural network training by parallelizing it on a distributed computing system, under the influence of sparsity. To this end, the effects of data parallelism and sparsity are first measured accurately through extensive experiments accompanied by metaparameter search. This thesis then establishes theoretical results that precisely account for these effects, which had previously been addressed only partially and empirically and thus remained debatable.
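
To give a flavour of the kind of measurement described above, the toy experiment below (steps_to_target is a hypothetical helper) counts how many optimization steps are needed to reach a fixed target loss at different batch sizes, with the learning rate scaled up with the batch size and capped for stability. The model, data, and thresholds are placeholders; the thesis's actual study uses image classification workloads with metaparameter search rather than this toy regression.

import torch
import torch.nn.functional as F

def steps_to_target(batch_size, target_loss=0.05, max_steps=20000):
    torch.manual_seed(0)
    X = torch.randn(4096, 20)
    y = X @ torch.randn(20, 1) + 0.01 * torch.randn(4096, 1)
    model = torch.nn.Linear(20, 1)
    # Simple linear learning-rate scaling with batch size, capped for stability.
    opt = torch.optim.SGD(model.parameters(), lr=min(0.01 * batch_size / 16, 0.4))
    for step in range(1, max_steps + 1):
        idx = torch.randint(0, X.shape[0], (batch_size,))
        loss = F.mse_loss(model(X[idx]), y[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
        if loss.item() < target_loss:
            return step
    return max_steps

# Larger batches typically reach the target in fewer steps, up to diminishing returns.
print({bs: steps_to_target(bs) for bs in (16, 64, 256, 1024)})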

Bibliographic Details
Main Author: Lee, N
Other Authors: Torr, PHS
Format: Thesis
Language: English
Published: 2020
Subjects: Deep learning
Institution: University of Oxford