Image classification with limited data information

Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets, such as ImageNet. However, in practice, challenges persist, especially with (I) low-quality image d...

Bibliographic Details
Main Author:	Cheng, Hao
Other Authors:	Wen Bihan
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Engineering; Image classification; Few-shot learning
Online Access:	https://hdl.handle.net/10356/174167

_version_	1811688191805095936
author	Cheng, Hao
author2	Wen Bihan
author_facet	Wen Bihan Cheng, Hao
author_sort	Cheng, Hao
collection	NTU
description	Image classification is a fundamental problem in image processing and computer vision. Recent algorithms have achieved significantly better results by learning deep features from large-scale datasets such as ImageNet. In practice, however, challenges persist, especially with (I) low-quality image data, such as noisy data or image data with variations in object appearance, as encountered in image-set classification, and (II) limited availability of image data, including scarce samples, for example in weakly supervised classification, or restricted availability of labeled data, as seen in few-shot image classification. These tasks require models that are generic and highly flexible, yet able to avoid over-fitting and to generalize when only a few samples are available. This thesis presents three works that tackle image classification with limited data information in weakly supervised and few-shot learning. We begin with small-scale visual classification tasks. From a traditional model-based perspective, we introduce a novel method called Joint Statistical and Spatial Sparse (J3S) representation, which reconciles local spatial patch structures and a global statistical Gaussian distribution under joint sparsity. Integrating global Gaussian statistical information and local spatial patch information through J3S with two joint dictionaries yields more accurate and robust results than considering either type of information alone. Moving beyond general small-scale classification tasks, we then extend our exploration to a specific task, few-shot image classification. Here, we propose two deep learning-based methods to tackle challenges arising from limited data information. First, we focus on Graph Neural Networks (GNN) and investigate the limitations of existing GNN methods for few-shot learning. To address the over-fitting and over-smoothing issues observed in recent GNN approaches, we propose the Attentive GNN (AGNN) framework. 
AGNN incorporates a triple-attention mechanism that facilitates graph initialization, graph update, and correlation across graph layers. We provide both theoretical analysis and practical illustrations to show how the proposed modules enhance GNN scalability for few-shot tasks, thereby improving few-shot performance. Subsequently, we explore more general and challenging few-shot scenarios, including few-shot domain generalization settings. To address feature distraction caused by class-irrelevant features such as style, domain, and background in image data, we propose a novel Disentangled Feature Representation framework (DFR). DFR removes information that is irrelevant for classification, thus enhancing performance through class-domain disentanglement. Furthermore, we reorganize DomainNet into a new dataset, FS-DomainNet, specifically for benchmarking few-shot domain generalization tasks. The main contributions of this thesis are threefold. First, we conduct a comprehensive study of image classification with limited data from both model-based and deep learning-based perspectives. Second, we propose three novel approaches that address various challenges caused by limited data information from different angles. Additionally, we introduce the FS-DomainNet dataset, specifically designed for evaluating the performance of few-shot methods in more general and challenging real-life scenarios. Finally, we validate the effectiveness of our proposed methods through extensive experiments on multiple benchmarks. Both qualitative and quantitative results demonstrate the improved performance of the proposed approaches for image classification with limited data. The contributions made in this study advance the understanding and practical capabilities of image classification with limited data information and provide groundwork for future research in this domain.
first_indexed	2024-10-01T05:28:17Z
format	Thesis-Doctor of Philosophy
id	ntu-10356/174167
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T05:28:17Z
publishDate	2024
publisher	Nanyang Technological University
record_format	dspace
spelling	ntu-10356/174167 2024-04-09T03:58:58Z Image classification with limited data information Cheng, Hao Wen Bihan School of Electrical and Electronic Engineering bihan.wen@ntu.edu.sg Engineering Image classification Few-shot learning Doctor of Philosophy 2024-03-18T10:45:01Z 2024-03-18T10:45:01Z 2023 Thesis-Doctor of Philosophy Cheng, H. (2023). Image classification with limited data information. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174167 10.32657/10356/174167 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
spellingShingle	Engineering Image classification Few-shot learning Cheng, Hao Image classification with limited data information
title	Image classification with limited data information
topic	Engineering; Image classification; Few-shot learning
url	https://hdl.handle.net/10356/174167
work_keys_str_mv	AT chenghao imageclassificationwithlimiteddatainformation

Similar Items