On Sequential Bayesian Inference for Continual Learning

Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and to provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and assess whether using the previous task's posterior as a prior for a new task can prevent catastrophic forgetting in Bayesian neural networks. Our first contribution is to perform sequential Bayesian inference using Hamiltonian Monte Carlo. We propagate the posterior as a prior for new tasks by approximating it with a density estimator fitted to Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty of performing sequential Bayesian inference in neural networks. We then study simple analytical examples of sequential Bayesian inference and continual learning and highlight the issue of model misspecification, which can lead to sub-optimal continual learning performance despite exact inference. Furthermore, we discuss how task data imbalances can cause forgetting. Given these limitations, we argue that probabilistic models of the continual learning generative process are needed, rather than sequential Bayesian inference over Bayesian neural network weights. Our final contribution is a simple baseline called Prototypical Bayesian Continual Learning, which is competitive with the best-performing Bayesian continual learning methods on class-incremental continual learning computer vision benchmarks.
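
The abstract's core mechanism is the standard sequential Bayesian update, in which the posterior over network weights after earlier tasks is reused as the prior for the next task and, in practice, replaced by a density estimate fitted to Hamiltonian Monte Carlo samples. The sketch below writes this out; the symbols \theta (network weights), \mathcal{D}_t (data for task t), and q_{t-1} (the fitted density estimator) are illustrative notation, not taken from the record itself.

% Sketch (assumed notation): sequential Bayesian update with the previous
% posterior acting as the new prior, approximated by a density estimator q_{t-1}.
\begin{align*}
  p(\theta \mid \mathcal{D}_{1:t}) &\propto p(\mathcal{D}_t \mid \theta)\, p(\theta \mid \mathcal{D}_{1:t-1}) \\
  p(\theta \mid \mathcal{D}_{1:t-1}) &\approx q_{t-1}(\theta) \quad \text{(fitted to Hamiltonian Monte Carlo samples)}
\end{align*}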

Bibliographic Details
Main Authors: Samuel Kessler (Department of Engineering Science, University of Oxford, Oxford OX2 6ED, UK), Adam Cobb (SRI International, Arlington, VA 22209, USA), Tim G. J. Rudner (Department of Computer Science, University of Oxford, Oxford OX1 3QG, UK), Stefan Zohren (Department of Engineering Science, University of Oxford, Oxford OX2 6ED, UK), Stephen J. Roberts (Department of Engineering Science, University of Oxford, Oxford OX2 6ED, UK)
Format: Article
Language: English
Published: MDPI AG, 2023-05-01
Series: Entropy, Vol. 25, No. 6, Article 884
ISSN: 1099-4300
DOI: 10.3390/e25060884
Subjects: continual learning; lifelong learning; sequential Bayesian inference; Bayesian deep learning; Bayesian neural networks
Online Access: https://www.mdpi.com/1099-4300/25/6/884