Joint object-material category segmentation from audio-visual cues
It is not always possible to recognise objects and infer material properties for a scene from visual cues alone, since objects can look visually similar whilst being made of very different materials. In this paper, we therefore present an approach that augments the available dense visual cues with sparse auditory cues in order to estimate dense object and material labels. Since estimates of object class and material properties are mutually-informative, we optimise our multi-output labelling jointly using a random-field framework. We evaluate our system on a new dataset with paired visual and auditory data that we make publicly available. We demonstrate that this joint estimation of object and material labels significantly outperforms the estimation of either category in isolation.
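The record does not include the paper's own formulation, but a minimal sketch of what a joint random-field energy over per-pixel object labels $o_i$ and material labels $m_i$ could look like is given below; the specific potentials, weights and symbols ($\psi$, $\phi$, $\theta$, $\lambda$, $\mathcal{A}$, $\mathcal{N}$) are illustrative assumptions, not taken from the paper.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Hypothetical joint energy: the compatibility term couples the two label fields.
\begin{align*}
E(\mathbf{o},\mathbf{m}) ={}& \sum_{i} \psi^{\mathrm{vis}}_{i}(o_i)
      && \text{dense visual unaries (all pixels)}\\
  &+ \sum_{i \in \mathcal{A}} \phi^{\mathrm{aud}}_{i}(m_i)
      && \text{sparse auditory unaries (probed locations only)}\\
  &+ \lambda \sum_{i} \theta(o_i, m_i)
      && \text{object--material compatibility (joint term)}\\
  &+ \sum_{(i,j)\in\mathcal{N}} \bigl( \psi_{ij}(o_i,o_j) + \phi_{ij}(m_i,m_j) \bigr)
      && \text{pairwise spatial smoothness}
\end{align*}
\end{document}
```

Here $\mathcal{A}$ would be the sparse set of locations with auditory measurements and $\mathcal{N}$ the neighbourhood system; minimising $E$ jointly over both label fields is one plausible reading of the "multi-output labelling" optimised in the abstract's random-field framework.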
Main Authors: | Arnab, A, Sapienza, M, Golodetz, S, Valentin, J, Miksik, O, Izadi, S, Torr, P |
---|---|
Format: | Conference item |
Published: | BMVA Press, 2015 |
collection | OXFORD |
---|---|
id | oxford-uuid:118d2199-a58e-45d3-9516-8c222ecd23e3 |
institution | University of Oxford |