AutoSimulate: (Quickly) learning synthetic data generation

Bibliographic Details
Main Authors:	Behl, HS, Baydin, AG, Gal, R, Torr, PHS, Vineet, V
Format:	Conference item
Language:	English
Published:	Springer 2020

collection	OXFORD
description	Simulation is increasingly being used for generating large labelled datasets in many machine learning problems. Recent methods have focused on adjusting simulator parameters with the goal of maximising accuracy on a validation task, usually relying on REINFORCE-like gradient estimators. However, these approaches are very expensive, as they treat the entire data generation, model training, and validation pipeline as a black box and require multiple costly objective evaluations at each iteration. We propose an efficient alternative for optimal synthetic data generation, based on a novel differentiable approximation of the objective. This allows us to optimize the simulator, which may be non-differentiable, requiring only one objective evaluation at each iteration with little overhead. We demonstrate on a state-of-the-art photorealistic renderer that the proposed method finds the optimal data distribution faster (up to 50×), with significantly reduced training data generation and better accuracy on real-world test datasets than previous methods.
first_indexed	2024-03-07T00:54:34Z
id	oxford-uuid:87962429-2603-446b-8ca2-afdf055a0088
institution	University of Oxford
last_indexed	2024-03-07T00:54:34Z
record_format	dspace
spelling	oxford-uuid:87962429-2603-446b-8ca2-afdf055a0088 | 2022-03-26T22:11:45Z | AutoSimulate: (Quickly) learning synthetic data generation | Conference item | http://purl.org/coar/resource_type/c_5794 | uuid:87962429-2603-446b-8ca2-afdf055a0088 | English | Symplectic Elements | Springer | 2020 | Behl, HS | Baydin, AG | Gal, R | Torr, PHS | Vineet, V

Similar Items