Occluded video instance segmentation: dataset and challenge

Although deep learning methods have achieved advanced video object recognition performance in recent years, perceiving heavily occluded objects in a video remains a very challenging task. To promote the development of occlusion understanding, we collect a large-scale dataset called OVIS for video instance segmentation in occluded scenarios. OVIS consists of 296k high-quality instance masks and 901 occluded scenes. While the human visual system can perceive occluded objects through contextual reasoning and association, our experiments suggest that current video understanding systems cannot. On the OVIS dataset, all baseline methods suffer a significant performance degradation of about 80% on the heavily occluded object group, which demonstrates that there is still a long way to go in understanding obscured objects and videos in complex real-world scenarios. To facilitate research on new paradigms for video understanding systems, we launched a challenge based on the OVIS dataset. The top-performing submitted algorithms achieve much higher performance than our baselines. In this paper, we introduce the OVIS dataset and further dissect it by analyzing the results of the baselines and submitted methods. The OVIS dataset and challenge information can be found at http://songbai.site/ovis.

Bibliographic Details
Main Authors: Qi, J; Gao, Y; Hu, Y; Wang, X; Liu, X; Bai, X; Belongie, S; Yuille, A; Torr, P; Bai, S
Format: Conference item
Language: English
Published: NeurIPS 2021