On Performance and Calibration of Natural Gradient Langevin Dynamics
Producing deep neural network (DNN) models with calibrated confidence is essential for applications in many fields, such as medical image analysis, natural language processing, and robotics. Modern neural networks have been reported to be poorly calibrated compared with those from a decade ago. The stochastic gradient Langevin dynamics (SGLD) algorithm offers tractable approximate Bayesian inference applicable to DNNs, providing a principled method for learning uncertainty. A recent benchmark study showed that SGLD can produce models more robust to covariate shift than competing methods. However, vanilla SGLD is also known to be slow, and preconditioning can improve its efficacy. This paper proposes eigenvalue-corrected Kronecker factorization (EKFAC) preconditioned SGLD (EKSGLD), in which a novel second-order gradient approximation is employed as a preconditioner for the SGLD algorithm. This approach is expected to bring together the advantages of second-order optimization and approximate Bayesian inference. Experiments comparing EKSGLD with existing preconditioning methods showed that it achieves higher predictive accuracy and better calibration on the validation set: EKSGLD improved the best accuracy by 3.06% on CIFAR-10 and 4.15% on MNIST, the best negative log-likelihood by 16.2% on CIFAR-10 and 11.4% on MNIST, and the best thresholded adaptive calibration error by 4.05% on CIFAR-10.
Main Authors: Hanif Amal Robbani, Alhadi Bustamam, Risman Adnan, Shandar Ahmad
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Natural gradient; second-order optimization; Bayesian deep learning; Langevin dynamics; confidence calibration; predictive uncertainty
Online Access: https://ieeexplore.ieee.org/document/10131934/
_version_ | 1797808506544324608 |
author | Hanif Amal Robbani; Alhadi Bustamam; Risman Adnan; Shandar Ahmad
author_facet | Hanif Amal Robbani; Alhadi Bustamam; Risman Adnan; Shandar Ahmad
author_sort | Hanif Amal Robbani |
collection | DOAJ |
description | Producing deep neural network (DNN) models with calibrated confidence is essential for applications in many fields, such as medical image analysis, natural language processing, and robotics. Modern neural networks have been reported to be poorly calibrated compared with those from a decade ago. The stochastic gradient Langevin dynamics (SGLD) algorithm offers tractable approximate Bayesian inference applicable to DNNs, providing a principled method for learning uncertainty. A recent benchmark study showed that SGLD can produce models more robust to covariate shift than competing methods. However, vanilla SGLD is also known to be slow, and preconditioning can improve its efficacy. This paper proposes eigenvalue-corrected Kronecker factorization (EKFAC) preconditioned SGLD (EKSGLD), in which a novel second-order gradient approximation is employed as a preconditioner for the SGLD algorithm. This approach is expected to bring together the advantages of second-order optimization and approximate Bayesian inference. Experiments comparing EKSGLD with existing preconditioning methods showed that it achieves higher predictive accuracy and better calibration on the validation set: EKSGLD improved the best accuracy by 3.06% on CIFAR-10 and 4.15% on MNIST, the best negative log-likelihood by 16.2% on CIFAR-10 and 11.4% on MNIST, and the best thresholded adaptive calibration error by 4.05% on CIFAR-10.
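The description above names the key mechanism: a curvature-based preconditioner applied to the SGLD update. As an illustration only, the following is a minimal, self-contained sketch of a generic preconditioned SGLD step on a toy Gaussian posterior. A simple diagonal RMSProp-style preconditioner stands in for EKFAC here (the paper's EKFAC preconditioner is a per-layer, eigenvalue-corrected Kronecker-factored curvature estimate, which this sketch does not reproduce), and every name and hyperparameter below is an illustrative assumption, not the authors' code.

```python
# Minimal sketch: generic preconditioned SGLD on a toy standard-Gaussian
# log-posterior. NOT the paper's EKSGLD: a diagonal RMSProp-style
# preconditioner stands in for EKFAC, and the divergence correction term
# Gamma(theta) is omitted, as is common in practice.
import numpy as np

rng = np.random.default_rng(0)

def grad_log_post(theta):
    # For p(theta) = N(0, I), the gradient of log p(theta) is simply -theta.
    return -theta

theta = rng.normal(size=5)           # parameter vector
v = np.ones_like(theta)              # running estimate of squared gradients
eps, alpha, lam = 1e-2, 0.99, 1e-5   # step size, decay rate, damping

samples = []
for t in range(5000):
    g = grad_log_post(theta)
    v = alpha * v + (1 - alpha) * g**2      # curvature proxy (RMSProp-style)
    G = 1.0 / (np.sqrt(v) + lam)            # diagonal preconditioner
    noise = rng.normal(size=theta.shape)
    # Preconditioned Langevin update: drift scaled by G, noise by sqrt(eps*G).
    theta = theta + 0.5 * eps * G * g + np.sqrt(eps * G) * noise
    if t >= 1000:                           # discard burn-in
        samples.append(theta.copy())

print("posterior mean ~", np.mean(samples, axis=0))  # expect values near 0
print("posterior var  ~", np.var(samples, axis=0))   # expect values near 1
```

In EKSGLD, as the title's "natural gradient" suggests, the preconditioner would instead approximate the inverse Fisher information via EKFAC, so the same update rule operates in a per-layer, curvature-aware coordinate system rather than elementwise.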
first_indexed | 2024-03-13T06:38:33Z |
format | Article |
id | doaj.art-e0fa5f58cd434d92b51c51f9ab7dae8c |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-03-13T06:38:33Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-e0fa5f58cd434d92b51c51f9ab7dae8c; 2023-06-08T23:01:26Z; eng; IEEE; IEEE Access; 2169-3536; 2023-01-01; vol. 11, pp. 53919-53931; doi: 10.1109/ACCESS.2023.3279124; document 10131934; On Performance and Calibration of Natural Gradient Langevin Dynamics; Hanif Amal Robbani (https://orcid.org/0000-0002-1510-6559), Alhadi Bustamam (https://orcid.org/0000-0002-7408-074X), Risman Adnan, Shandar Ahmad (https://orcid.org/0000-0002-7287-305X); Department of Mathematics, Universitas Indonesia, Depok, Indonesia (Robbani, Bustamam, Adnan); School of Computational and Integrative Sciences, Jawaharlal Nehru University, Delhi, India (Ahmad); https://ieeexplore.ieee.org/document/10131934/; Natural gradient; second-order optimization; Bayesian deep learning; Langevin dynamics; confidence calibration; predictive uncertainty
spellingShingle | Hanif Amal Robbani; Alhadi Bustamam; Risman Adnan; Shandar Ahmad; On Performance and Calibration of Natural Gradient Langevin Dynamics; IEEE Access; Natural gradient; second-order optimization; Bayesian deep learning; Langevin dynamics; confidence calibration; predictive uncertainty
title | On Performance and Calibration of Natural Gradient Langevin Dynamics |
title_full | On Performance and Calibration of Natural Gradient Langevin Dynamics |
title_fullStr | On Performance and Calibration of Natural Gradient Langevin Dynamics |
title_full_unstemmed | On Performance and Calibration of Natural Gradient Langevin Dynamics |
title_short | On Performance and Calibration of Natural Gradient Langevin Dynamics |
title_sort | on performance and calibration of natural gradient langevin dynamics |
topic | Natural gradient; second-order optimization; Bayesian deep learning; Langevin dynamics; confidence calibration; predictive uncertainty
url | https://ieeexplore.ieee.org/document/10131934/ |
work_keys_str_mv | AT hanifamalrobbani onperformanceandcalibrationofnaturalgradientlangevindynamics AT alhadibustamam onperformanceandcalibrationofnaturalgradientlangevindynamics AT rismanadnan onperformanceandcalibrationofnaturalgradientlangevindynamics AT shandarahmad onperformanceandcalibrationofnaturalgradientlangevindynamics |