Active Vision in Binocular Depth Estimation: A Top-Down Perspective

Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. Ho...


Bibliographic Details
Main Authors:	Matteo Priorelli, Giovanni Pezzulo, Ivilin Peev Stoianov
Format:	Article
Language:	English
Published:	MDPI AG 2023-09-01
Series:	Biomimetics
Subjects:	active inference; depth perception; active vision; predictive coding; action-perception cycles
Online Access:	https://www.mdpi.com/2313-7673/8/5/445

_version_	1797581107379568640
author	Matteo Priorelli Giovanni Pezzulo Ivilin Peev Stoianov
author_facet	Matteo Priorelli Giovanni Pezzulo Ivilin Peev Stoianov
author_sort	Matteo Priorelli
collection	DOAJ
description	Depth estimation is an ill-posed problem; objects of different shapes or dimensions, even if at different distances, may project to the same image on the retina. Our brain uses several cues for depth estimation, including monocular cues such as motion parallax and binocular cues such as diplopia. However, it remains unclear how the computations required for depth estimation are implemented in biologically plausible ways. State-of-the-art approaches to depth estimation based on deep neural networks implicitly describe the brain as a hierarchical feature detector. In this paper, we instead propose an alternative approach that casts depth estimation as a problem of active inference. We show that depth can be inferred by inverting a hierarchical generative model that simultaneously predicts the eyes’ projections from a 2D belief over an object. Model inversion consists of a series of biologically plausible homogeneous transformations based on Predictive Coding principles. Under the plausible assumption of a nonuniform fovea resolution, depth estimation favors an active vision strategy that fixates the object with the eyes, rendering the depth belief more accurate. This strategy is not realized by first fixating on a target and then estimating the depth; instead, it combines the two processes through action–perception cycles, using a mechanism similar to that of saccades during object recognition. The proposed approach requires only local (top-down and bottom-up) message passing, which can be implemented in biologically plausible neural circuits.
first_indexed	2024-03-10T23:00:31Z
format	Article
id	doaj.art-9bdd95c1806f4bcc88032cfdc696392b
institution	Directory of Open Access Journals
issn	2313-7673
language	English
last_indexed	2024-03-10T23:00:31Z
publishDate	2023-09-01
publisher	MDPI AG
record_format	Article
series	Biomimetics
spelling	doaj.art-9bdd95c1806f4bcc88032cfdc696392b; 2023-11-19T09:44:36Z; eng; MDPI AG; Biomimetics; 2313-7673; 2023-09-01; vol. 8, no. 5, art. 445; 10.3390/biomimetics8050445; Active Vision in Binocular Depth Estimation: A Top-Down Perspective; Matteo Priorelli (0), Giovanni Pezzulo (1), Ivilin Peev Stoianov (2); (0) Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 35137 Padova, Italy; (1) Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 00185 Rome, Italy; (2) Institute of Cognitive Sciences and Technologies, National Research Council of Italy, 35137 Padova, Italy; https://www.mdpi.com/2313-7673/8/5/445; active inference; depth perception; active vision; predictive coding; action-perception cycles
spellingShingle	Matteo Priorelli Giovanni Pezzulo Ivilin Peev Stoianov Active Vision in Binocular Depth Estimation: A Top-Down Perspective Biomimetics active inference depth perception active vision predictive coding action-perception cycles
title	Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_full	Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_fullStr	Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_full_unstemmed	Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_short	Active Vision in Binocular Depth Estimation: A Top-Down Perspective
title_sort	active vision in binocular depth estimation a top down perspective
topic	active inference; depth perception; active vision; predictive coding; action-perception cycles
url	https://www.mdpi.com/2313-7673/8/5/445
work_keys_str_mv	AT matteopriorelli activevisioninbinoculardepthestimationatopdownperspective AT giovannipezzulo activevisioninbinoculardepthestimationatopdownperspective AT ivilinpeevstoianov activevisioninbinoculardepthestimationatopdownperspective

Similar Items