Continual learning for efficient machine learning


Bibliographic details
Main author: Chaudhry, A
Other authors: Dokania, P
Format: Thesis
Language: English
Published: 2020
Institution: University of Oxford
Subjects: Machine Learning and Artificial Intelligence; Continual Learning
Full description

Deep learning has enjoyed tremendous success over the last decade, but the training of practically useful deep models remains highly inefficient, both in terms of the number of weight updates and the number of training samples. To address one aspect of this inefficiency, this thesis studies the continual learning setting, in which a model learns from a sequence of tasks, leveraging previous knowledge to learn new tasks quickly. The main challenge in continual learning is to keep the model from catastrophically forgetting previous information when it is updated for a new task.

Towards this end, the thesis first proposes a continual learning algorithm that preserves previous knowledge by regularizing the KL-divergence between the conditional likelihoods of two successive tasks. It is shown that this regularization imposes a quadratic penalty on the network weights based on the curvature at the minimum of the previous task.

Second, the thesis presents a more efficient continual learning algorithm that uses an episodic memory of past tasks as a constraint, such that the loss on the episodic memory does not increase when a weight update is made for a new task. It is shown that using episodic memory to constrain the objective is more effective than regularizing the network parameters. Furthermore, to increase the speed at which new tasks are learned, the use of compositional task descriptors with a joint embedding model is proposed, which greatly improves forward transfer.

The episodic-memory-based continual learning objective is then simplified by using the memory directly in the loss function. Despite its tendency to memorize the data present in the tiny episodic memory, the resulting algorithm is shown to generalize better than the one where the memory is used as a constraint. An analysis is proposed that attributes this surprising generalization to the regularization effect of the data from new tasks.

This algorithm is then used to learn continually from synthetic and real data. To this end, a method is proposed that generates synthetic data points for each task by optimizing the forgetting loss in hindsight on the replay buffer. A nested optimization objective for continual learning is devised that effectively uses these synthetic points to reduce forgetting in memory-based continual learning methods.

Finally, the thesis presents a continual learning algorithm that learns different tasks in non-overlapping feature subspaces. It is shown that minimizing the overlap by keeping the subspaces of different tasks orthogonal to each other reduces interference between the representations of these tasks.
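The sketch below is not code from the thesis; it is a minimal illustration of two mechanisms the description mentions, with hypothetical function names and a diagonal curvature approximation assumed (the description itself only says "curvature"): a curvature-weighted quadratic penalty on the weights, and a first-order version of the episodic-memory constraint in which a gradient step is projected so that it does not increase the loss on the memory.

```python
import numpy as np

def quadratic_penalty(theta, theta_prev, curvature, lam=1.0):
    # Curvature-weighted quadratic penalty on the weights:
    # (lam / 2) * sum_i c_i * (theta_i - theta_prev_i)^2,
    # where c_i approximates the curvature of the previous task's loss at its
    # minimum (e.g. a diagonal Fisher/Hessian estimate -- an assumption made
    # here for illustration).
    return 0.5 * lam * np.sum(curvature * (theta - theta_prev) ** 2)

def project_gradient(g_new, g_mem):
    # First-order form of the episodic-memory constraint: if the new-task
    # gradient conflicts with the gradient computed on the episodic memory
    # (negative inner product), remove the conflicting component so that a
    # small step along the projected gradient does not increase the memory
    # loss to first order.
    dot = float(g_new @ g_mem)
    if dot >= 0.0:
        return g_new  # no conflict: use the new-task gradient as-is
    return g_new - (dot / float(g_mem @ g_mem)) * g_mem

# Hypothetical use inside a training step (names are placeholders):
#   g_new = grad_of_loss_on_current_batch(theta)
#   g_mem = grad_of_loss_on_episodic_memory(theta)
#   theta = theta - learning_rate * project_gradient(g_new, g_mem)
```

The projection is cheaper than solving a constrained optimization over the whole memory, which is one reading of the efficiency claim above; the actual formulation in the thesis may differ.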