The effect of softmax temperature on recent knowledge distillation algorithms

Knowledge distillation is a technique for transferring knowledge from a large, complex teacher model to a smaller, faster student model, and is an important category of model compression. In this study, I survey knowledge distillation algorithms proposed in recent years, consider their merits in principle, and attempt to verify some of their reported results empirically. The study compares their performance on two image classification datasets, CIFAR-10 and CIFAR-100, using ResNet architectures for both the teacher and the student. I investigate the effect of softmax temperature, a key hyperparameter in knowledge distillation, on the classification accuracy of the student models. The results show that higher temperatures tend to work better on datasets with fewer classes and lower temperatures on datasets with more classes, and that the relative performance of the algorithms depends on both the dataset and the temperature.
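To make the role of the temperature concrete, the following is a minimal sketch of the classic logit-matching distillation loss (soft targets at temperature T combined with hard-label cross-entropy), assuming a PyTorch setup. The function name and the values of T and alpha are illustrative choices, not the specific algorithms or settings evaluated in this project.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Classic logit-matching knowledge distillation loss (Hinton-style soft targets).
    # T and alpha are illustrative values, not the settings used in the project.
    soft_student = F.log_softmax(student_logits / T, dim=1)  # student distribution softened by T
    soft_teacher = F.softmax(teacher_logits / T, dim=1)      # teacher distribution softened by T
    # The KL term is scaled by T^2 so its gradient magnitude stays comparable to the CE term.
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)              # hard-label cross-entropy
    return alpha * kd + (1.0 - alpha) * ce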

Bibliographic Details
Main Author: Poh, Dominique
Other Authors: Liu, Weichen
School: School of Computer Science and Engineering, Nanyang Technological University
Degree: Bachelor of Engineering (Computer Engineering)
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2023
Project Code: SCSE22-0670
Subjects: Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Online Access: https://hdl.handle.net/10356/172431
Citation: Poh, D. (2023). The effect of softmax temperature on recent knowledge distillation algorithms. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/172431