Domain randomization for neural network classification

Abstract Large data requirements are often the main hurdle in training neural networks. Convolutional neural network (CNN) classifiers in particular require tens of thousands of pre-labeled images per category to approach human-level accuracy, while often failing to generalize to out-of-domain test...

Bibliographic Details
Main Authors: Svetozar Zarko Valtchev, Jianhong Wu
Format: Article
Language: English
Published: SpringerOpen 2021-07-01
Series: Journal of Big Data
Subjects: Domain randomization; Synthetic image generation; Neural network classifiers
Online Access: https://doi.org/10.1186/s40537-021-00455-5
author Svetozar Zarko Valtchev
Jianhong Wu
collection DOAJ
description Abstract Large data requirements are often the main hurdle in training neural networks. Convolutional neural network (CNN) classifiers in particular require tens of thousands of pre-labeled images per category to approach human-level accuracy, while often failing to generalize to out-of-domain test sets. Acquiring and labelling such datasets is often expensive, time-consuming and tedious in practice. Synthetic data provides a cheap and efficient way to assemble such large datasets. Using domain randomization (DR), we show that a sufficiently well generated synthetic image dataset can be used to train a neural network classifier that rivals state-of-the-art models trained on real datasets, achieving accuracy levels as high as 88% on a baseline cats vs dogs classification task. We show that the most important domain randomization parameter is a large variety of subjects, while secondary parameters such as lighting and textures are found to be less significant to the model accuracy. Our results also provide evidence to suggest that models trained on domain-randomized images transfer to new domains better than those trained on real photos. Model performance appears to remain stable as the number of categories increases.
first_indexed 2024-12-22T12:59:50Z
format Article
id doaj.art-ff752bb0d9584aa2b2052aef2bd5fafa
institution Directory Open Access Journal
issn 2196-1115
language English
last_indexed 2024-12-22T12:59:50Z
publishDate 2021-07-01
publisher SpringerOpen
record_format Article
series Journal of Big Data
spelling doaj.art-ff752bb0d9584aa2b2052aef2bd5fafa; 2022-12-21T18:25:01Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2021-07-01; 81112; 10.1186/s40537-021-00455-5; Domain randomization for neural network classification; Svetozar Zarko Valtchev (Laboratory of Industrial and Applied Mathematics, York University); Jianhong Wu (Laboratory of Industrial and Applied Mathematics, York University); https://doi.org/10.1186/s40537-021-00455-5; Domain randomization; Synthetic image generation; Neural network classifiers
title Domain randomization for neural network classification
topic Domain randomization
Synthetic image generation
Neural network classifiers
url https://doi.org/10.1186/s40537-021-00455-5