On sequential Bayesian inference for continual learning

Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and assess whether using the previous task’s posterior as a prior for a new task can pr...

Bibliographic Details
Main Authors: Kessler, S, Cobb, A, Rudner, TGJ, Zohren, S, Roberts, SJ
Format: Journal article
Language: English
Published: MDPI 2023
_version_ 1826310881221279744
author Kessler, S
Cobb, A
Rudner, TGJ
Zohren, S
Roberts, SJ
author_facet Kessler, S
Cobb, A
Rudner, TGJ
Zohren, S
Roberts, SJ
author_sort Kessler, S
collection OXFORD
description Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and assess whether using the previous task’s posterior as a prior for a new task can prevent catastrophic forgetting in Bayesian neural networks. Our first contribution is to perform sequential Bayesian inference using Hamiltonian Monte Carlo. We propagate the posterior as a prior for new tasks by approximating the posterior via fitting a density estimator on Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty in performing sequential Bayesian inference in neural networks. From there, we study simple analytical examples of sequential Bayesian inference and continual learning and highlight the issue of model misspecification, which can lead to sub-optimal continual learning performance despite exact inference. Furthermore, we discuss how task data imbalances can cause forgetting. From these limitations, we argue that we need probabilistic models of the continual learning generative process rather than relying on sequential Bayesian inference over Bayesian neural network weights. Our final contribution is to propose a simple baseline called Prototypical Bayesian Continual Learning, which is competitive with the best-performing Bayesian continual learning methods on class-incremental continual learning computer vision benchmarks.
first_indexed 2024-03-07T08:00:02Z
format Journal article
id oxford-uuid:1e1ac507-d725-4db5-a5bf-fb46fd4c501f
institution University of Oxford
language English
last_indexed 2024-03-07T08:00:02Z
publishDate 2023
publisher MDPI
record_format dspace
spelling oxford-uuid:1e1ac507-d725-4db5-a5bf-fb46fd4c501f | 2023-09-18T15:44:59Z | On sequential Bayesian inference for continual learning | Journal article | http://purl.org/coar/resource_type/c_dcae04bc | uuid:1e1ac507-d725-4db5-a5bf-fb46fd4c501f | English | Symplectic Elements | MDPI | 2023 | Kessler, S; Cobb, A; Rudner, TGJ; Zohren, S; Roberts, SJ | Sequential Bayesian inference can be used for continual learning to prevent catastrophic forgetting of past tasks and provide an informative prior when learning new tasks. We revisit sequential Bayesian inference and assess whether using the previous task’s posterior as a prior for a new task can prevent catastrophic forgetting in Bayesian neural networks. Our first contribution is to perform sequential Bayesian inference using Hamiltonian Monte Carlo. We propagate the posterior as a prior for new tasks by approximating the posterior via fitting a density estimator on Hamiltonian Monte Carlo samples. We find that this approach fails to prevent catastrophic forgetting, demonstrating the difficulty in performing sequential Bayesian inference in neural networks. From there, we study simple analytical examples of sequential Bayesian inference and CL and highlight the issue of model misspecification, which can lead to sub-optimal continual learning performance despite exact inference. Furthermore, we discuss how task data imbalances can cause forgetting. From these limitations, we argue that we need probabilistic models of the continual learning generative process rather than relying on sequential Bayesian inference over Bayesian neural network weights. Our final contribution is to propose a simple baseline called Prototypical Bayesian Continual Learning, which is competitive with the best performing Bayesian continual learning methods on class incremental continual learning computer vision benchmarks.
spellingShingle Kessler, S
Cobb, A
Rudner, TGJ
Zohren, S
Roberts, SJ
On sequential Bayesian inference for continual learning
title On sequential Bayesian inference for continual learning
title_full On sequential Bayesian inference for continual learning
title_fullStr On sequential Bayesian inference for continual learning
title_full_unstemmed On sequential Bayesian inference for continual learning
title_short On sequential Bayesian inference for continual learning
title_sort on sequential bayesian inference for continual learning
work_keys_str_mv AT kesslers onsequentialbayesianinferenceforcontinuallearning
AT cobba onsequentialbayesianinferenceforcontinuallearning
AT rudnertgj onsequentialbayesianinferenceforcontinuallearning
AT zohrens onsequentialbayesianinferenceforcontinuallearning
AT robertssj onsequentialbayesianinferenceforcontinuallearning