3D-aware instance segmentation and tracking in egocentric videos
Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person video that leverages 3D awareness to overcome these obstacles.
Main Authors: | Bhalgat, Y; Tschernezki, V; Laina, I; Henriques, JF; Vedaldi, A; Zisserman, A |
---|---|
Format: | Conference item |
Language: | English |
Published: | 2025 |
_version_ | 1824458954920427520 |
---|---|
author | Bhalgat, Y Tschernezki, V Laina, I Henriques, JF Vedaldi, A Zisserman, A |
collection | OXFORD |
description | Egocentric videos present unique challenges for 3D scene understanding due to rapid camera motion, frequent object occlusions, and limited object visibility. This paper introduces a novel approach to instance segmentation and tracking in first-person video that leverages 3D awareness to overcome these obstacles. Our method integrates scene geometry, 3D object centroid tracking, and instance segmentation to create a robust framework for analyzing dynamic egocentric scenes. By incorporating spatial and temporal cues, we achieve superior performance compared to state-of-the-art 2D approaches. Extensive evaluations on the challenging EPIC Fields dataset demonstrate significant improvements across a range of tracking and segmentation consistency metrics. Specifically, our method outperforms the next best performing approach by 7 points in Association Accuracy (AssA) and 4.5 points in IDF1 score, while reducing the number of ID switches by 73% to 80% across various object categories. Leveraging our tracked instance segmentations, we showcase downstream applications in 3D object reconstruction and amodal video object segmentation in these egocentric settings. |
first_indexed | 2025-02-19T04:34:06Z |
format | Conference item |
id | oxford-uuid:01610456-eaa8-4044-a4d0-e80ac5ab2452 |
institution | University of Oxford |
language | English |
last_indexed | 2025-02-19T04:34:06Z |
publishDate | 2025 |
record_format | dspace |
title | 3D-aware instance segmentation and tracking in egocentric videos |