Meta-Sim: Learning to Generate Synthetic Datasets

© 2019 IEEE. Training models to high performance requires large labeled datasets, which are expensive to obtain. The goal of our work is to automatically synthesize labeled datasets that are relevant for a downstream task. We propose Meta-Sim, which learns a generative model of synthetic scenes and obtains images, together with their corresponding ground truth, via a graphics engine. We parametrize our dataset generator with a neural network, which learns to modify attributes of scene graphs obtained from probabilistic scene grammars so as to minimize the distribution gap between its rendered outputs and target data. If the real dataset comes with a small labeled validation set, we additionally aim to optimize a meta-objective, i.e., downstream task performance. Experiments show that the proposed method can greatly improve content generation quality over a human-engineered probabilistic scene grammar, both qualitatively and quantitatively, as measured by performance on a downstream task.
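The loop the abstract describes can be sketched in a few lines. The following is a minimal, illustrative Python/PyTorch sketch, not the authors' implementation: the scene grammar is reduced to a hypothetical prior over per-node attribute vectors, the non-differentiable graphics engine is replaced by a frozen differentiable feature extractor so the distribution-matching loss can be backpropagated, the gap is measured with a simple RBF-kernel MMD, and the meta-objective on downstream task performance is omitted. All names (sample_scene_attributes, render_features, mmd) are assumptions made for the sketch.

import torch
import torch.nn as nn

ATTR_DIM = 8      # attributes per scene-graph node (pose, scale, color, ...)
FEAT_DIM = 16     # feature space in which distributions are compared


def sample_scene_attributes(batch: int) -> torch.Tensor:
    """Stand-in for sampling from a probabilistic scene grammar:
    returns one attribute vector per scene."""
    return torch.rand(batch, ATTR_DIM)


# Frozen differentiable stand-in for "render, then extract features".
render_features = nn.Sequential(nn.Linear(ATTR_DIM, FEAT_DIM), nn.Tanh())
for p in render_features.parameters():
    p.requires_grad_(False)


def mmd(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Simple RBF-kernel maximum mean discrepancy between two samples."""
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2.0 * FEAT_DIM))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()


# The dataset generator: a network that modifies grammar-sampled attributes.
generator = nn.Sequential(
    nn.Linear(ATTR_DIM, 32), nn.ReLU(), nn.Linear(32, ATTR_DIM)
)
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

# Fake "real" target data: features of scenes drawn from a shifted prior.
real_feats = render_features(0.5 + 0.1 * torch.randn(256, ATTR_DIM))

for step in range(200):
    attrs = sample_scene_attributes(64)        # sample from the grammar
    modified = attrs + generator(attrs)        # learn attribute offsets
    fake_feats = render_features(modified)     # "render" the scenes
    loss = mmd(fake_feats, real_feats)         # distribution gap to target
    opt.zero_grad()
    loss.backward()
    opt.step()

Gradients flow through the frozen feature extractor to the generator's attribute offsets; in the paper's setting, where the renderer is a real graphics engine, this backward path would need a differentiable approximation or a gradient estimator instead.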


Bibliographic Details
Format: Article (Conference Paper)
Language: English
Published: IEEE, 2021
Published in: Proceedings of the IEEE International Conference on Computer Vision, 2019-October
DOI: 10.1109/ICCV.2019.00465
Institution: Massachusetts Institute of Technology
Rights: Creative Commons Attribution-NonCommercial-ShareAlike (http://creativecommons.org/licenses/by-nc-sa/4.0/)
Online Access: https://hdl.handle.net/1721.1/137178