DynamicStereo: consistent dynamic depth from stereo videos
We consider the problem of reconstructing a dynamic scene observed from a stereo camera. Most existing methods for depth from stereo treat different stereo frames independently, leading to temporally inconsistent depth predictions. Temporal consistency is especially important for immersive AR or VR...
Main Authors: | Karaev, N; Rocco, I; Graham, B; Neverova, N; Vedaldi, A; Rupprecht, C |
---|---|
Format: | Conference item |
Language: | English |
Published: | IEEE 2023 |
_version_ | 1826311047906066432 |
---|---|
author | Karaev, N; Rocco, I; Graham, B; Neverova, N; Vedaldi, A; Rupprecht, C |
author_facet | Karaev, N; Rocco, I; Graham, B; Neverova, N; Vedaldi, A; Rupprecht, C |
author_sort | Karaev, N |
collection | OXFORD |
description | We consider the problem of reconstructing a dynamic scene observed from a stereo camera. Most existing methods for depth from stereo treat different stereo frames independently, leading to temporally inconsistent depth predictions. Temporal consistency is especially important for immersive AR or VR scenarios, where flickering greatly diminishes the user experience. We propose DynamicStereo, a novel transformer-based architecture to estimate disparity for stereo videos. The network learns to pool information from neighboring frames to improve the temporal consistency of its predictions. Our architecture is designed to process stereo videos efficiently through divided attention layers. We also introduce Dynamic Replica, a new benchmark dataset containing synthetic videos of people and animals in scanned environments, which provides complementary training and evaluation data for dynamic stereo closer to real applications than existing datasets. Training with this dataset further improves the quality of predictions of our proposed DynamicStereo as well as prior methods. Finally, it acts as a benchmark for consistent stereo methods. |
first_indexed | 2024-03-07T08:02:36Z |
format | Conference item |
id | oxford-uuid:bf2b5113-251b-43fc-aeb3-0f139f5fe854 |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T08:02:36Z |
publishDate | 2023 |
publisher | IEEE |
record_format | dspace |
spelling | oxford-uuid:bf2b5113-251b-43fc-aeb3-0f139f5fe854; 2023-10-04T07:55:45Z; DynamicStereo: consistent dynamic depth from stereo videos; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:bf2b5113-251b-43fc-aeb3-0f139f5fe854; English; Symplectic Elements; IEEE; 2023; Karaev, N; Rocco, I; Graham, B; Neverova, N; Vedaldi, A; Rupprecht, C; We consider the problem of reconstructing a dynamic scene observed from a stereo camera. Most existing methods for depth from stereo treat different stereo frames independently, leading to temporally inconsistent depth predictions. Temporal consistency is especially important for immersive AR or VR scenarios, where flickering greatly diminishes the user experience. We propose DynamicStereo, a novel transformer-based architecture to estimate disparity for stereo videos. The network learns to pool information from neighboring frames to improve the temporal consistency of its predictions. Our architecture is designed to process stereo videos efficiently through divided attention layers. We also introduce Dynamic Replica, a new benchmark dataset containing synthetic videos of people and animals in scanned environments, which provides complementary training and evaluation data for dynamic stereo closer to real applications than existing datasets. Training with this dataset further improves the quality of predictions of our proposed DynamicStereo as well as prior methods. Finally, it acts as a benchmark for consistent stereo methods. |
spellingShingle | Karaev, N; Rocco, I; Graham, B; Neverova, N; Vedaldi, A; Rupprecht, C; DynamicStereo: consistent dynamic depth from stereo videos |
title | DynamicStereo: consistent dynamic depth from stereo videos |
title_sort | dynamicstereo consistent dynamic depth from stereo videos |
work_keys_str_mv | AT karaevn dynamicstereoconsistentdynamicdepthfromstereovideos AT roccoi dynamicstereoconsistentdynamicdepthfromstereovideos AT grahamb dynamicstereoconsistentdynamicdepthfromstereovideos AT neverovan dynamicstereoconsistentdynamicdepthfromstereovideos AT vedaldia dynamicstereoconsistentdynamicdepthfromstereovideos AT rupprechtc dynamicstereoconsistentdynamicdepthfromstereovideos |