ImageSpirit

Humans describe images in terms of nouns and adjectives while algorithms operate on images represented as sets of pixels. Bridging this gap between how humans would like to access images versus their typical representation is the goal of image parsing, which involves assigning object and attribute l...

Bibliographic Details
Main Authors: Cheng, M-M, Zheng, S, Lin, W-Y, Vineet, V, Sturgess, P, Crook, N, Mitra, NJ, Torr, P
Format: Journal article
Language: English
Published: Association for Computing Machinery 2014
_version_ 1826313275685470208
author Cheng, M-M
Zheng, S
Lin, W-Y
Vineet, V
Sturgess, P
Crook, N
Mitra, NJ
Torr, P
author_facet Cheng, M-M
Zheng, S
Lin, W-Y
Vineet, V
Sturgess, P
Crook, N
Mitra, NJ
Torr, P
author_sort Cheng, M-M
collection OXFORD
description Humans describe images in terms of nouns and adjectives while algorithms operate on images represented as sets of pixels. Bridging this gap between how humans would like to access images versus their typical representation is the goal of image parsing, which involves assigning object and attribute labels to pixels. In this article we propose treating nouns as object labels and adjectives as visual attribute labels. This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images. We propose an efficient (interactive time) solution. Using the extracted labels as handles, our system empowers a user to verbally refine the results. This enables hands-free parsing of an image into pixel-wise object/attribute labels that correspond to human semantics. Verbally selecting objects of interest enables a novel and natural interaction modality that can possibly be used to interact with new generation devices (e.g., smartphones, Google Glass, livingroom devices). We demonstrate our system on a large number of real-world images with varying complexity. To help understand the trade-offs compared to traditional mouse-based interactions, results are reported for both a large-scale quantitative evaluation and a user study.
first_indexed 2024-09-25T04:10:30Z
format Journal article
id oxford-uuid:66b8c923-aa5f-4cfc-952f-b6637d88d87c
institution University of Oxford
language English
last_indexed 2024-09-25T04:10:30Z
publishDate 2014
publisher Association for Computing Machinery
record_format dspace
spelling oxford-uuid:66b8c923-aa5f-4cfc-952f-b6637d88d87c 2024-06-24T15:17:29Z ImageSpirit Journal article http://purl.org/coar/resource_type/c_dcae04bc uuid:66b8c923-aa5f-4cfc-952f-b6637d88d87c English Symplectic Elements Association for Computing Machinery 2014 Cheng, M-M Zheng, S Lin, W-Y Vineet, V Sturgess, P Crook, N Mitra, NJ Torr, P Humans describe images in terms of nouns and adjectives while algorithms operate on images represented as sets of pixels. Bridging this gap between how humans would like to access images versus their typical representation is the goal of image parsing, which involves assigning object and attribute labels to pixels. In this article we propose treating nouns as object labels and adjectives as visual attribute labels. This allows us to formulate the image parsing problem as one of jointly estimating per-pixel object and attribute labels from a set of training images. We propose an efficient (interactive time) solution. Using the extracted labels as handles, our system empowers a user to verbally refine the results. This enables hands-free parsing of an image into pixel-wise object/attribute labels that correspond to human semantics. Verbally selecting objects of interest enables a novel and natural interaction modality that can possibly be used to interact with new generation devices (e.g., smartphones, Google Glass, livingroom devices). We demonstrate our system on a large number of real-world images with varying complexity. To help understand the trade-offs compared to traditional mouse-based interactions, results are reported for both a large-scale quantitative evaluation and a user study.
spellingShingle Cheng, M-M
Zheng, S
Lin, W-Y
Vineet, V
Sturgess, P
Crook, N
Mitra, NJ
Torr, P
ImageSpirit
title ImageSpirit
title_full ImageSpirit
title_fullStr ImageSpirit
title_full_unstemmed ImageSpirit
title_short ImageSpirit
title_sort imagespirit
work_keys_str_mv AT chengmm imagespirit
AT zhengs imagespirit
AT linwy imagespirit
AT vineetv imagespirit
AT sturgessp imagespirit
AT crookn imagespirit
AT mitranj imagespirit
AT torrp imagespirit