Long-tailed image recognition

The long-tailed distribution problem often poses great challenges to deep-learning-based computer vision tasks, causing models to perform poorly on balanced test sets, particularly for the less frequent classes. With the rapid increase in large-scale deployment of AI solutions in industry and th...

Bibliographic Details
Main Author: Li, Zhaochen
Other Authors: Chen Change Loy
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2021
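Subjects: Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision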
Online Access: https://hdl.handle.net/10356/148026
_version_ 1811685333166718976
author Li, Zhaochen
author2 Chen Change Loy
author_facet Chen Change Loy
Li, Zhaochen
author_sort Li, Zhaochen
collection NTU
description The long-tailed distribution problem often poses great challenges to deep-learning-based computer vision tasks, causing models to perform poorly on balanced test sets, particularly for the less frequent classes. With the rapid increase in large-scale deployment of AI solutions in industry and the long-tailed nature of many real-world datasets, it becomes critical to closely examine and address this problem. Traditional class-imbalance solutions in machine learning typically adopt data re-sampling or loss re-weighting, while recent research in deep learning focuses on more sophisticated re-balancing strategies and model architecture modifications. In this paper, we set out to explore the long-tailed problem via two approaches: 1) how Mixup, a commonly used data augmentation technique, affects the model's performance and how it could be modified; 2) how to construct a memory module and incorporate a nearest class mean (NCM) classifier to help with prediction. We first provide a detailed analysis of Mixup under two sampling strategies: Class-Balanced Sampling and Instance-Based Sampling. We find that the original Mixup approach is hyperparameter-sensitive and fails to improve model performance. We then propose Prior-Aware Mixup, which makes use of the label distribution to govern the pair selection process. For the second approach, we make use of knowledge distillation with a momentum-updated memory module and propose fusion and assignment techniques that outperform several state-of-the-art (SOTA) results on long-tailed benchmark datasets.
first_indexed 2024-10-01T04:42:51Z
format Final Year Project (FYP)
id ntu-10356/148026
institution Nanyang Technological University
language English
last_indexed 2024-10-01T04:42:51Z
publishDate 2021
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/1480262021-04-22T05:35:35Z Long-tailed image recognition Li, Zhaochen Chen Change Loy School of Computer Science and Engineering ccloy@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Bachelor of Engineering (Computer Science) 2021-04-22T05:35:35Z 2021-04-22T05:35:35Z 2021 Final Year Project (FYP) Li, Z. (2021). Long-tailed image recognition. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148026 https://hdl.handle.net/10356/148026 en SCSE20-0823 application/pdf Nanyang Technological University
spellingShingle Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Li, Zhaochen
Long-tailed image recognition
title Long-tailed image recognition
title_full Long-tailed image recognition
title_fullStr Long-tailed image recognition
title_full_unstemmed Long-tailed image recognition
title_short Long-tailed image recognition
title_sort long tailed image recognition
topic Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
url https://hdl.handle.net/10356/148026
work_keys_str_mv AT lizhaochen longtailedimagerecognition