Continual learning for efficient machine learning


Bibliographic details
Main author: Chaudhry, A
Other authors: Dokania, P
Format: Thesis
Language: English
Published: 2020
Institution: University of Oxford
Subjects: Machine Learning and Artificial Intelligence; Continual Learning
Full description

Deep learning has enjoyed tremendous success over the last decade, but the training of practically useful deep models remains highly inefficient, both in terms of the number of weight updates and the number of training samples. To address one aspect of this inefficiency, this thesis studies the continual learning setting, in which a model learns from a sequence of tasks, leveraging previous knowledge to learn new tasks quickly. The main challenge in continual learning is to keep the model from catastrophically forgetting previous information when it is updated for a new task.

Towards this end, the thesis first proposes a continual learning algorithm that preserves previous knowledge by regularizing the KL-divergence between the conditional likelihoods of two successive tasks. It is shown that this regularization imposes a quadratic penalty on the network weights based on the curvature at the minimum of the previous task.

Second, the thesis presents a more efficient continual learning algorithm that uses an episodic memory of past tasks as a constraint, such that the loss on the episodic memory does not increase when a weight update is made for a new task. It is shown that using episodic memory to constrain the objective is more effective than regularizing the network parameters. Furthermore, to increase the speed at which new tasks are learned, the use of compositional task descriptors with a joint embedding model is proposed, which greatly improves forward transfer.

The episodic-memory-based continual learning objective is then simplified by using the memory directly in the loss function. Despite its tendency to memorize the data present in the tiny episodic memory, the resulting algorithm is shown to generalize better than the one where the memory is used as a constraint. An analysis is proposed that attributes this surprising generalization to the regularization effect of the data from new tasks.

This algorithm is then used to learn continually from synthetic and real data. To this end, a method is proposed that generates synthetic data points for each task by optimizing the forgetting loss in hindsight on the replay buffer. A nested optimization objective for continual learning is devised that effectively uses these synthetic points to reduce forgetting in memory-based continual learning methods.

Finally, the thesis presents a continual learning algorithm that learns different tasks in non-overlapping feature subspaces. It is shown that minimizing the overlap by keeping the subspaces of different tasks orthogonal to each other reduces interference between the representations of these tasks.
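The sketch below is not code from the thesis; it is a minimal illustration of two mechanisms the description mentions, with hypothetical function names and a diagonal curvature approximation assumed (the description itself only says "curvature"): a curvature-weighted quadratic penalty on the weights, and a first-order version of the episodic-memory constraint in which a gradient step is projected so that it does not increase the loss on the memory.

```python
import numpy as np

def quadratic_penalty(theta, theta_prev, curvature, lam=1.0):
    # Curvature-weighted quadratic penalty on the weights:
    # (lam / 2) * sum_i c_i * (theta_i - theta_prev_i)^2,
    # where c_i approximates the curvature of the previous task's loss at its
    # minimum (e.g. a diagonal Fisher/Hessian estimate -- an assumption made
    # here for illustration).
    return 0.5 * lam * np.sum(curvature * (theta - theta_prev) ** 2)

def project_gradient(g_new, g_mem):
    # First-order form of the episodic-memory constraint: if the new-task
    # gradient conflicts with the gradient computed on the episodic memory
    # (negative inner product), remove the conflicting component so that a
    # small step along the projected gradient does not increase the memory
    # loss to first order.
    dot = float(g_new @ g_mem)
    if dot >= 0.0:
        return g_new  # no conflict: use the new-task gradient as-is
    return g_new - (dot / float(g_mem @ g_mem)) * g_mem

# Hypothetical use inside a training step (names are placeholders):
#   g_new = grad_of_loss_on_current_batch(theta)
#   g_mem = grad_of_loss_on_episodic_memory(theta)
#   theta = theta - learning_rate * project_gradient(g_new, g_mem)
```

The projection is cheaper than solving a constrained optimization over the whole memory, which is one reading of the efficiency claim above; the actual formulation in the thesis may differ.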