A Computational Model for Combinatorial Generalization in Physical Perception from Sound
Humans possess the unique ability of combinatorial generalization in auditory perception: given novel auditory stimuli, humans perform auditory scene analysis and infer causal physical interactions based on prior knowledge. Could we build a computational model that achieves human-like combinatorial...
Main Authors: | Wang, Yunyun; Gan, Chuang; Siegel, Max; Zhang, Zhoutong; Wu, Jiajun; Tenenbaum, Joshua |
---|---|
Format: | Article |
Language: | English |
Published: | Cognitive Computational Neuroscience, 2021 |
Online Access: | https://hdl.handle.net/1721.1/138340 |
_version_ | 1826191424319651840 |
---|---|
author | Wang, Yunyun; Gan, Chuang; Siegel, Max; Zhang, Zhoutong; Wu, Jiajun; Tenenbaum, Joshua |
author_sort | Wang, Yunyun |
collection | MIT |
description | Humans possess the unique ability of combinatorial generalization in auditory perception: given novel auditory stimuli, humans perform auditory scene analysis and infer causal physical interactions based on prior knowledge. Could we build a computational model that achieves human-like combinatorial generalization? In this paper, we present a case study on box-shaking: having heard only the sound of a single ball moving in a box, we seek to interpret the sound of two or three balls of different materials. To solve this task, we propose a hybrid model with two components: a neural network for perception, and a physical audio engine for simulation. We use the outcome of the network as an initial guess and perform MCMC sampling with the audio engine to improve the result. Combining neural networks with a physical audio engine, our hybrid model achieves combinatorial generalization efficiently and accurately in auditory scene perception. |
first_indexed | 2024-09-23T08:55:59Z |
format | Article |
id | mit-1721.1/138340 |
institution | Massachusetts Institute of Technology |
language | English |
last_indexed | 2024-09-23T08:55:59Z |
publishDate | 2021 |
publisher | Cognitive Computational Neuroscience |
record_format | dspace |
spelling | Record: mit-1721.1/138340. Citation: Wang, Yunyun, Gan, Chuang, Siegel, Max, Zhang, Zhoutong, Wu, Jiajun et al. 2019. "A Computational Model for Combinatorial Generalization in Physical Perception from Sound." 2019 Conference on Cognitive Computational Neuroscience. DOI: 10.32470/CCN.2019.1276-0. Type: Article (http://purl.org/eprint/type/ConferencePaper). Language: en. File format: application/pdf. License: Creative Commons Attribution 3.0 unported license (https://creativecommons.org/licenses/by/3.0/). Handle: https://hdl.handle.net/1721.1/138340. Publisher: Cognitive Computational Neuroscience. Record timestamps: 2021-12-08T03:35:30Z, 2021-12-07T13:44:38Z, 2021-12-07T13:39:22Z. |
title | A Computational Model for Combinatorial Generalization in Physical Perception from Sound |
url | https://hdl.handle.net/1721.1/138340 |
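
The description in this record outlines a two-stage procedure: a neural network supplies an initial guess, and MCMC sampling against a physical audio engine refines it. The sketch below illustrates that general pattern only; it is not the authors' implementation. Everything in it is an assumption made for illustration: `toy_engine` is a hypothetical stand-in for the audio engine (a sum of damped sinusoids, one per ball), `network_guess` is a hard-coded placeholder for a network's output, and the sampler is a plain random-walk Metropolis scheme with a Gaussian likelihood and a flat prior.

```python
import math
import random


def toy_engine(params, n=64):
    """Hypothetical stand-in for a physical audio engine.

    Each entry of `params` is treated as the frequency of one damped
    sinusoid; the output is their sum over n time steps.
    """
    return [
        sum(math.exp(-0.05 * t) * math.sin(f * t) for f in params)
        for t in range(n)
    ]


def log_likelihood(params, observed, sigma=0.5):
    """Gaussian log-likelihood (up to a constant) of the observed signal."""
    synth = toy_engine(params, len(observed))
    sq_err = sum((s - o) ** 2 for s, o in zip(synth, observed))
    return -sq_err / (2.0 * sigma ** 2)


def mcmc_refine(initial_guess, observed, steps=2000, step_size=0.02, seed=0):
    """Random-walk Metropolis sampling started from an initial guess.

    Returns the highest-likelihood sample visited and its log-likelihood.
    A flat (improper) prior is assumed, so only the likelihood ratio matters.
    """
    rng = random.Random(seed)
    current = list(initial_guess)
    current_ll = log_likelihood(current, observed)
    best, best_ll = list(current), current_ll
    for _ in range(steps):
        # Symmetric Gaussian proposal around the current state.
        proposal = [p + rng.gauss(0.0, step_size) for p in current]
        proposal_ll = log_likelihood(proposal, observed)
        # Metropolis rule: accept with probability min(1, likelihood ratio).
        accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
        if rng.random() < accept_prob:
            current, current_ll = proposal, proposal_ll
            if current_ll > best_ll:
                best, best_ll = list(current), current_ll
    return best, best_ll


# Toy usage: two "balls" with assumed frequencies, and a placeholder guess
# standing in for what a perception network might output.
true_params = [0.3, 0.7]
observed = toy_engine(true_params)
network_guess = [0.33, 0.66]
initial_ll = log_likelihood(network_guess, observed)
refined, refined_ll = mcmc_refine(network_guess, observed)
print("initial log-likelihood:", initial_ll)
print("refined log-likelihood:", refined_ll)
```

Starting the chain from a reasonable guess rather than a random point is what lets the sampler spend its steps on local refinement. In this sketch the refined log-likelihood can never be lower than the initial one, because the best sample is tracked from the starting state onward.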