Mimicking the oracle: an initial phase decorrelation approach for class incremental learning

Class Incremental Learning (CIL) aims to learn a classifier in a phase-by-phase manner, where data for only a subset of the classes are provided at each phase. Previous works mainly focus on mitigating forgetting in the phases after the initial one. However, we find that improving CIL at its initial phase is also a promising direction. Specifically, we experimentally show that directly encouraging the CIL learner at the initial phase to output representations similar to those of the model jointly trained on all classes can greatly boost CIL performance. Motivated by this, we study the difference between a naïvely trained initial-phase model and the oracle model. Specifically, since one major difference between these two models is the number of training classes, we investigate how this difference affects the model representations. We find that, with fewer training classes, the data representations of each class lie in a long and narrow region; with more training classes, the representations of each class scatter more uniformly. Inspired by this observation, we propose Class-wise Decorrelation (CwD), which effectively regularizes the representations of each class to scatter more uniformly, thus mimicking the model jointly trained with all classes (i.e., the oracle model). Our CwD is simple to implement and easy to plug into existing methods. Extensive experiments on various benchmark datasets show that CwD consistently and significantly improves the performance of existing state-of-the-art methods by around 1% to 3%.
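
The abstract describes the idea but not the formula. As a minimal, hypothetical sketch (in PyTorch), one way such a class-wise decorrelation penalty could look is a Frobenius-norm penalty on each class's feature correlation matrix; the function name, the normalization details, the 1/d^2 scaling, and the weight eta below are illustrative assumptions, not details taken from the paper:

    import torch

    def classwise_decorrelation_loss(z: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        """Penalize correlation between feature dimensions within each class,
        encouraging per-class representations to scatter more uniformly.
        z: backbone features of shape (batch_size, d); y: integer class labels."""
        losses = []
        for c in y.unique():
            zc = z[y == c]                      # features of class c, shape (n_c, d)
            if zc.shape[0] < 2:
                continue                        # need >= 2 samples to estimate correlation
            zc = zc - zc.mean(dim=0)            # center each feature dimension
            zc = zc / (zc.std(dim=0) + 1e-8)    # scale each dimension to unit variance
            corr = zc.t() @ zc / zc.shape[0]    # (d, d) correlation matrix of class c
            # Squared Frobenius norm scaled by 1/d^2; minimized when
            # off-diagonal correlations vanish.
            losses.append((corr ** 2).sum() / corr.shape[0] ** 2)
        return torch.stack(losses).mean() if losses else z.new_zeros(())

    # Used as an auxiliary term during initial-phase training, e.g.:
    # total_loss = task_loss + eta * classwise_decorrelation_loss(features, labels)

Driving the off-diagonal correlations toward zero pushes each class's features to spread across all dimensions rather than along a few dominant directions, which is the "scatter more uniformly" behavior the abstract describes.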


Bibliographic Details
Main Authors: Shi, Y, Zhou, K, Liang, J, Jiang, Z, Feng, J, Torr, P, Bai, S, Tan, VYF
Format: Conference item
Language: English
Published: IEEE 2022