Toward efficient deep learning with sparse neural networks

Despite the tremendous success that deep learning has achieved in recent years, it remains challenging to deal with the excessive computational and memory costs involved in executing deep learning-based applications. To address this challenge, this thesis studies sparse neural networks, focusing on their construction, initialization, and large-scale training, as a step toward efficient deep learning.

Firstly, this thesis addresses the problem of finding sparse neural networks by pruning. Network pruning is an effective methodology for sparsifying neural networks, yet existing approaches often introduce hyperparameters that must either be tuned with expert knowledge or be set by ad-hoc intuition, and they typically entail iterative training steps. Instead, this thesis begins by proposing an efficient pruning method that is applied to a neural network prior to training, in a single shot. The sparse neural networks obtained with this method, once trained, exhibit state-of-the-art performance on various image classification tasks.
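
The abstract does not state the pruning criterion itself. Purely as a rough illustration of what single-shot pruning before training can look like, the minimal PyTorch sketch below scores every weight on one mini-batch with a gradient-times-weight saliency and keeps only the highest-scoring connections; the saliency choice and the helper prune_at_init are illustrative assumptions, not the thesis's exact method.

import torch
import torch.nn.functional as F

def prune_at_init(model, inputs, targets, sparsity=0.9):
    # One forward/backward pass on a single mini-batch scores all weights.
    loss = F.cross_entropy(model(inputs), targets)
    weights = [p for p in model.parameters() if p.dim() > 1]  # skip biases
    grads = torch.autograd.grad(loss, weights)
    # Illustrative saliency: |gradient * weight|, pooled across layers.
    scores = torch.cat([(g * w).abs().flatten() for g, w in zip(grads, weights)])
    keep = max(1, int((1.0 - sparsity) * scores.numel()))
    threshold = torch.topk(scores, keep).values.min()
    # Binary masks marking the surviving connections.
    return [((g * w).abs() >= threshold).float() for g, w in zip(grads, weights)]

The returned masks would then be applied multiplicatively to the corresponding weight tensors throughout training, so that pruned connections stay at zero.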

Although efficient, it remains unclear exactly why this approach of pruning at initialization can be effective. This thesis then extends the method by developing a new perspective, from which the problem of finding trainable sparse neural networks is approached through network initialization. Identifying initialization as a key to finding and training sparse neural networks successfully, this thesis proposes a sufficient initialization condition that can easily be satisfied with a simple optimization step and that, once achieved, accelerates the training of sparse neural networks significantly.
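
The condition itself is not spelled out in the abstract. Purely as an illustration of how an initialization condition might be enforced with a simple optimization step, the sketch below (the helper enforce_orthogonality is hypothetical) nudges the weight matrices of a freshly initialized, possibly pruned network toward approximate layerwise orthogonality, a common recipe for well-behaved signal propagation; it is not necessarily the condition proposed in the thesis.

import torch

def enforce_orthogonality(weight_list, steps=200, lr=0.1):
    # weight_list: 2-D weight tensors; returns adjusted copies.
    weights = [w.detach().clone().requires_grad_(True) for w in weight_list]
    opt = torch.optim.SGD(weights, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Penalize deviation of W^T W from the identity for every layer.
        penalty = sum(((w.t() @ w) - torch.eye(w.shape[1])).pow(2).sum() for w in weights)
        penalty.backward()
        opt.step()
    return [w.detach() for w in weights]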

While sparse neural networks can be obtained by pruning at initialization, there has been little study of how these sparse networks are subsequently trained. This thesis lastly concentrates on studying data parallelism, a straightforward approach to speeding up neural network training by parallelizing it on a distributed computing system, under the influence of sparsity. To this end, the effects of data parallelism and sparsity are first measured accurately through extensive experiments accompanied by metaparameter search. This thesis then establishes theoretical results that precisely account for these effects, which had previously been addressed only partially and empirically and thus remained debatable.
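
To give a flavour of the kind of measurement described above, the toy experiment below (steps_to_target is a hypothetical helper) counts how many optimization steps are needed to reach a fixed target loss at different batch sizes, with the learning rate scaled up with the batch size and capped for stability. The model, data, and thresholds are placeholders; the thesis's actual study uses image classification workloads with metaparameter search rather than this toy regression.

import torch
import torch.nn.functional as F

def steps_to_target(batch_size, target_loss=0.05, max_steps=20000):
    torch.manual_seed(0)
    X = torch.randn(4096, 20)
    y = X @ torch.randn(20, 1) + 0.01 * torch.randn(4096, 1)
    model = torch.nn.Linear(20, 1)
    # Simple linear learning-rate scaling with batch size, capped for stability.
    opt = torch.optim.SGD(model.parameters(), lr=min(0.01 * batch_size / 16, 0.4))
    for step in range(1, max_steps + 1):
        idx = torch.randint(0, X.shape[0], (batch_size,))
        loss = F.mse_loss(model(X[idx]), y[idx])
        opt.zero_grad()
        loss.backward()
        opt.step()
        if loss.item() < target_loss:
            return step
    return max_steps

# Larger batches typically reach the target in fewer steps, up to diminishing returns.
print({bs: steps_to_target(bs) for bs in (16, 64, 256, 1024)})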

Bibliographic Details
Main Author: Lee, N
Other Authors: Torr, PHS
Format: Thesis
Language: English
Published: 2020
Subjects: Deep learning
Institution: University of Oxford