A Computational Model for Combinatorial Generalization in Physical Perception from Sound

Humans possess the unique ability of combinatorial generalization in auditory perception: given novel auditory stimuli, humans perform auditory scene analysis and infer causal physical interactions based on prior knowledge. Could we build a computational model that achieves human-like combinatorial...

Bibliographic Details
Main Authors: Wang, Yunyun, Gan, Chuang, Siegel, Max, Zhang, Zhoutong, Wu, Jiajun, Tenenbaum, Joshua
Format: Article
Language: English
Published: Cognitive Computational Neuroscience 2021
Online Access: https://hdl.handle.net/1721.1/138340
collection MIT
description Humans possess the unique ability of combinatorial generalization in auditory perception: given novel auditory stimuli, humans perform auditory scene analysis and infer causal physical interactions based on prior knowledge. Could we build a computational model that achieves human-like combinatorial generalization? In this paper, we present a case study on box-shaking: having heard only the sound of a single ball moving in a box, we seek to interpret the sound of two or three balls of different materials. To solve this task, we propose a hybrid model with two components: a neural network for perception, and a physical audio engine for simulation. We use the outcome of the network as an initial guess and perform MCMC sampling with the audio engine to improve the result. Combining neural networks with a physical audio engine, our hybrid model achieves combinatorial generalization efficiently and accurately in auditory scene perception.
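The refinement step in the description (start from an initial guess, then run MCMC sampling against a simulator) can be illustrated with a minimal sketch. This is not the authors' model: the spectral signatures, the `simulate` function standing in for a physical audio engine, and the hard-coded initial guess standing in for the neural network's output are all hypothetical, chosen only to make the loop runnable.

```python
import math
import random

# Hypothetical per-material spectral signatures (toy values, not from the paper).
SIGNATURES = {
    "wood": (1.0, 0.2, 0.0),
    "metal": (0.1, 0.9, 0.6),
    "glass": (0.0, 0.4, 1.0),
}
MATERIALS = sorted(SIGNATURES)


def simulate(balls):
    """Toy stand-in for an audio engine: a scene's features are the sum of its balls' signatures."""
    return tuple(sum(SIGNATURES[m][i] for m in balls) for i in range(3))


def log_likelihood(balls, observed, sigma=0.1):
    """Gaussian log-likelihood (up to a constant) of the observed features given a hypothesis."""
    sim = simulate(balls)
    return -sum((s - o) ** 2 for s, o in zip(sim, observed)) / (2 * sigma ** 2)


def propose(balls, rng):
    """Symmetric proposal: resample the material of one randomly chosen ball."""
    i = rng.randrange(len(balls))
    new = list(balls)
    new[i] = rng.choice(MATERIALS)
    return tuple(new)


def mcmc_refine(initial_guess, observed, steps=2000, seed=0):
    """Metropolis sampling from an initial guess; returns the best hypothesis visited."""
    rng = random.Random(seed)
    current = tuple(initial_guess)
    current_ll = log_likelihood(current, observed)
    best, best_ll = current, current_ll
    for _ in range(steps):
        cand = propose(current, rng)
        cand_ll = log_likelihood(cand, observed)
        # Accept improvements always, worse candidates with probability exp(delta).
        if cand_ll >= current_ll or rng.random() < math.exp(cand_ll - current_ll):
            current, current_ll = cand, cand_ll
            if current_ll > best_ll:
                best, best_ll = current, current_ll
    return best


# Usage: the "observation" is synthesized from a known two-ball scene,
# and the (deliberately wrong) initial guess plays the role of the network output.
observed = simulate(("metal", "glass"))
result = mcmc_refine(("wood", "wood"), observed)
```

In this toy setting the sampler only searches over material labels; a realistic version would presumably also sample continuous physical parameters and compare synthesized audio rather than three-number feature vectors.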
institution Massachusetts Institute of Technology
Record ID: mit-1721.1/138340
Record timestamps: 2021-12-08T03:35:30Z, 2021-12-07T13:44:38Z, 2021-12-07T13:39:22Z
Issued: 2019
Type: Article (http://purl.org/eprint/type/ConferencePaper)
Citation: Wang, Yunyun, Gan, Chuang, Siegel, Max, Zhang, Zhoutong, Wu, Jiajun et al. 2019. "A Computational Model for Combinatorial Generalization in Physical Perception from Sound." 2019 Conference on Cognitive Computational Neuroscience.
DOI: 10.32470/CCN.2019.1276-0
Conference: 2019 Conference on Cognitive Computational Neuroscience
License: Creative Commons Attribution 3.0 unported license (https://creativecommons.org/licenses/by/3.0/)
File format: application/pdf
Publisher: Cognitive Computational Neuroscience