Long-tailed image recognition
The long-tailed distribution problem often poses great challenges to deep-learning-based computer vision tasks, making the models perform poorly on the balanced test set, particularly for the less frequent classes. With the rapid increase in large-scale deployment of AI solutions in industry and th...
Main Author: | Li, Zhaochen |
---|---|
Other Authors: | Chen Change Loy |
Format: | Final Year Project (FYP) |
Language: | English |
Published: | Nanyang Technological University, 2021 |
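Subjects: | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |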
Online Access: | https://hdl.handle.net/10356/148026 |
_version_ | 1811685333166718976 |
---|---|
author | Li, Zhaochen |
author2 | Chen Change Loy |
collection | NTU |
description | The long-tailed distribution problem often poses great challenges to deep-learning-based computer vision tasks, making the models perform poorly on the balanced test set, particularly for the less frequent classes. With the rapid increase in large-scale deployment of AI solutions in industry and the long-tailed nature of many real-world datasets, it becomes critical to closely examine and address this problem. Traditional class-imbalance solutions in machine learning typically adopt data re-sampling or loss re-weighting, while recent research in deep learning focuses on more sophisticated re-balancing strategies and model architecture modifications. In this paper, we set out to explore the long-tailed problem via two approaches: 1) how Mixup, a commonly used data augmentation technique, affects the model's performance and how it could be modified; 2) how to construct a memory module and incorporate a nearest class mean (NCM) classifier to help with prediction. We first provide a detailed analysis of Mixup under two sampling strategies: Class-Balanced Sampling and Instance-Based Sampling. We find that the original Mixup approach is hyperparameter-sensitive and fails to improve the model performance. We then propose Prior-Aware Mixup, which makes use of the label distribution to govern the pair selection process. For the second approach, we make use of knowledge distillation with a momentum-updated memory module and propose fusion and assignment techniques which outperform several state-of-the-art (SOTA) results on long-tailed benchmark datasets. |
first_indexed | 2024-10-01T04:42:51Z |
format | Final Year Project (FYP) |
id | ntu-10356/148026 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T04:42:51Z |
publishDate | 2021 |
publisher | Nanyang Technological University |
record_format | dspace |
spelling | ntu-10356/148026 2021-04-22T05:35:35Z Long-tailed image recognition Li, Zhaochen Chen Change Loy School of Computer Science and Engineering ccloy@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Bachelor of Engineering (Computer Science) 2021-04-22T05:35:35Z 2021-04-22T05:35:35Z 2021 Final Year Project (FYP) Li, Z. (2021). Long-tailed image recognition. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/148026 https://hdl.handle.net/10356/148026 en SCSE20-0823 application/pdf Nanyang Technological University |
title | Long-tailed image recognition |
topic | Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
url | https://hdl.handle.net/10356/148026 |
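
The abstract in this record names two building blocks, Mixup and an NCM classifier backed by a momentum-updated memory. As a rough illustration only, the sketch below shows the standard forms of both ideas in NumPy. It is not the author's code: it implements plain Mixup with uniform pair selection (not the proposed Prior-Aware Mixup) and a generic nearest-class-mean classifier whose class means are updated by an exponential moving average (not the thesis's memory module, fusion, or assignment techniques). The names `mixup_batch` and `MomentumNCM` and the default values of `alpha` and `momentum` are assumptions made for this example.

```python
import numpy as np


def mixup_batch(x, y_onehot, alpha=1.0, rng=None):
    """Standard Mixup: blend each example with a uniformly chosen partner.

    Hypothetical helper for illustration; pair selection here ignores the
    label distribution, unlike the Prior-Aware variant named in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))      # partner index for every example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam


class MomentumNCM:
    """Nearest-class-mean classifier with momentum-updated class means.

    Hypothetical sketch: one mean feature vector per class, refreshed by an
    exponential moving average of batch means.
    """

    def __init__(self, num_classes, feat_dim, momentum=0.9):
        self.means = np.zeros((num_classes, feat_dim))
        self.seen = np.zeros(num_classes, dtype=bool)
        self.momentum = momentum

    def update(self, feats, labels):
        for c in np.unique(labels):
            batch_mean = feats[labels == c].mean(axis=0)
            if self.seen[c]:
                # Move the stored mean a small step toward the batch mean.
                self.means[c] = (self.momentum * self.means[c]
                                 + (1.0 - self.momentum) * batch_mean)
            else:
                # First time this class appears: initialise from the batch.
                self.means[c] = batch_mean
                self.seen[c] = True

    def predict(self, feats):
        # Squared Euclidean distance to every class mean: (n, num_classes).
        diff = feats[:, None, :] - self.means[None, :, :]
        dist = (diff ** 2).sum(axis=-1)
        dist[:, ~self.seen] = np.inf    # skip classes with no stored mean
        return dist.argmin(axis=1)
```

In a training loop, `update` would typically be called on each batch of extracted features, and `predict` used at test time. Because every class keeps exactly one mean regardless of how many samples it has, this kind of classifier does not carry a learned bias toward frequent classes.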