Does Formula-Driven Supervised Learning Work on Small Datasets?

Bibliographic Details
Main Authors: Kodai Nakashima, Hirokatsu Kataoka, Yutaka Satoh
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10266324/
Description
Summary: Does formula-driven supervised learning (FDSL) work effectively with fine-tuning on small datasets? Additionally, how many natural images does a network pre-trained with FDSL require to acquire sufficient image features? FDSL is a pre-training method that employs mathematical formulas to automatically generate images and their corresponding labels. These questions are crucial to address, because acquiring features valuable for natural image recognition tasks requires the opportunity to learn from a certain number of natural images through pre-training and fine-tuning. Furthermore, because FDSL is progressively gaining attention as a promising method to mitigate concerns about privacy violations, fairness protection, and the labor-intensive effort of annotating natural images, clarifying its effectiveness and limitations is essential for widespread adoption. In this study, we compare FDSL with ImageNet-1k pre-training and training from scratch through fine-tuning on datasets on the order of 100 to 10,000 images. Through our experiments, we discovered that (i) there is a significant difference from ImageNet-1k pre-training when using datasets containing approximately 100 to 1,000 images, and (ii) approximately 50,000 images are required for FDSL to be equivalent to ImageNet-1k pre-training. Moreover, we verified the validity of the hyper-parameters during fine-tuning. We firmly believe that this study elucidates the current limitations of FDSL and offers valuable guidance for future research, ultimately contributing to the field of computer vision.
ISSN:2169-3536