COTS: A Multipurpose RGB-D Dataset for Saliency and Image Manipulation Applications

Many modern computer vision systems include several modules that perform different processing operations packaged as a single pipeline architecture. This generally introduces a challenge in the evaluation process since most datasets provide evaluation data for just one of the operations. In this pap...

Full description

Bibliographic Details
Main Authors: Dylan Seychell, Carl James Debono, Mark Bugeja, Jeremy Borg, Matthew Sacco
Format: Article
Language:English
Published: IEEE 2021-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/9340352/
Description
Summary:Many modern computer vision systems include several modules that perform different processing operations packaged as a single pipeline architecture. This generally introduces a challenge in the evaluation process since most datasets provide evaluation data for just one of the operations. In this paper, we present an RGB-D dataset that was designed from first principles to cater for applications that involve salient object detection, segmentation, inpainting and blending techniques. This addresses a gap in the evaluation of image inpainting and blending applications that generally rely on subjective evaluation due to the lack of availability of comparative data. A set of experiments were carried out to demonstrate how the COTS dataset can be used to evaluate these different applications. This dataset includes a variety of scenes, where each scene is captured multiple times, each time adding a new object to the previous scene. This allows for a comparative analysis at pixel level in image inpainting and blending applications. Moreover, all objects were manually labeled in order to offer the possibility of salient object detection even in scenes that contain multiple objects. An online test with 1267 participants was also carried out, and this dataset also includes the click coordinates of users' selection for every image, introducing a user interaction dimension in the same RGB-D dataset. This dataset was also validated using state of the art techniques, obtaining an $F_\beta $ of 0.957 in salient object detection and a mean (Intersection over Union) IoU of 0.942 in Segmentation. Results demonstrate that the COTS dataset introduces novel possibilities for the evaluation of computer vision applications.
ISSN:2169-3536