Towards interpretable deep local learning with successive gradient reconciliation

Relieving the reliance of neural network training on global back-propagation (BP) has emerged as a notable research topic due to the biological implausibility and huge memory consumption of BP. Among the existing solutions, local learning optimizes gradient-isolated modules of a neural network with local errors and has proven effective even on large-scale datasets. However, the reconciliation among local errors has never been investigated. In this paper, we first theoretically study non-greedy layer-wise training and show that convergence cannot be guaranteed when the local gradient in a module w.r.t. its input is not reconciled with the local gradient in the previous module w.r.t. its output. Inspired by this theoretical result, we further propose a local training strategy that successively regularizes the gradient reconciliation between neighboring modules without breaking gradient isolation or introducing any learnable parameters. Our method can be integrated into both local-BP and BP-free settings. In experiments, we achieve significant performance improvements over previous methods. In particular, our method for CNN and Transformer architectures on ImageNet attains performance competitive with global BP while saving more than 40% of memory consumption.

Detailed bibliography

Main Authors: Yang, Y; Li, X; Alfarra, M; Hammoud, H; Bibi, A; Torr, P; Ghanem, B
Format: Conference item
Language: English
Published: PMLR, 2024
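The reconciliation condition in the abstract can be made concrete with a toy sketch. The snippet below (a hypothetical illustration, not the authors' code) builds two gradient-isolated linear modules with local squared-error losses and compares the gradient of module 2's local loss w.r.t. its input against the gradient of module 1's local loss w.r.t. its output; a reconciliation regularizer of the kind the paper describes would penalize misalignment between these two vectors. The targets `t1`, `t2` and the `1 - cosine` penalty form are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4,))            # input sample
t1 = rng.normal(size=(4,))           # hypothetical local target for module 1
t2 = rng.normal(size=(4,))           # hypothetical local target for module 2
W1 = 0.5 * rng.normal(size=(4, 4))   # gradient-isolated module 1
W2 = 0.5 * rng.normal(size=(4, 4))   # gradient-isolated module 2

def local_grads(W1, W2, x):
    """Analytic local gradients for two linear modules with L2 local losses."""
    h1 = W1 @ x                      # module 1 output (treated as detached input to module 2)
    h2 = W2 @ h1
    # local loss 1 = 0.5*||h1 - t1||^2  ->  gradient w.r.t. module 1's OUTPUT h1
    g_out1 = h1 - t1
    # local loss 2 = 0.5*||h2 - t2||^2  ->  gradient w.r.t. module 2's INPUT h1
    g_in2 = W2.T @ (h2 - t2)
    return h1, g_out1, g_in2

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

h1, g_out1, g_in2 = local_grads(W1, W2, x)
# A reconciliation-style penalty: large when the two local gradients disagree.
penalty = 1.0 - cosine(g_out1, g_in2)
print(penalty)
```

In a full training loop, this penalty would be added to each module's local objective so that neighboring modules pull the shared activation in consistent directions, without any cross-module back-propagation.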
Record ID: oxford-uuid:a00caa99-e242-4a85-93b5-90890d1662c5
Institution: University of Oxford