Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary

We describe a model of object recognition as machine translation. In this model, recognition is a process of annotating image regions with words. Firstly, images are segmented into regions, which are classified into region types using a variety of features. A mapping between region types and keyword...


Bibliographic Details
Main Authors:	Duygulu, P; Barnard, K; Freitas, J; Forsyth, D
Format:	Conference item
Published:	Springer Berlin Heidelberg 2002

_version_	1826285437096820736
author	Duygulu, P; Barnard, K; Freitas, J; Forsyth, D
author_facet	Duygulu, P; Barnard, K; Freitas, J; Forsyth, D
author_sort	Duygulu, P
collection	OXFORD
description	We describe a model of object recognition as machine translation. In this model, recognition is a process of annotating image regions with words. Firstly, images are segmented into regions, which are classified into region types using a variety of features. A mapping between region types and the keywords supplied with the images is then learned, using a method based around EM. This process is analogous to learning a lexicon from an aligned bitext. For the implementation we describe, these words are nouns taken from a large vocabulary. On a large test set, the method can predict numerous words with high accuracy. Simple methods identify words that cannot be predicted well. We show how to cluster words that individually are difficult to predict into clusters that can be predicted well; for example, we cannot predict the distinction between train and locomotive using the current set of features, but we can predict the underlying concept. The method is trained on a substantial collection of images. Extensive experimental results illustrate the strengths and weaknesses of the approach.
first_indexed	2024-03-07T01:28:49Z
format	Conference item
id	oxford-uuid:92e538e8-10b7-461a-9f8e-db525ab8fdf4
institution	University of Oxford
last_indexed	2024-03-07T01:28:49Z
publishDate	2002
publisher	Springer Berlin Heidelberg
record_format	dspace
spelling	oxford-uuid:92e538e8-10b7-461a-9f8e-db525ab8fdf4; 2022-03-26T23:28:39Z; Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:92e538e8-10b7-461a-9f8e-db525ab8fdf4; Department of Computer Science; Springer Berlin Heidelberg; 2002; Duygulu, P; Barnard, K; Freitas, J; Forsyth, D
spellingShingle	Duygulu, P; Barnard, K; Freitas, J; Forsyth, D; Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
title	Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
title_full	Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
title_fullStr	Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
title_full_unstemmed	Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
title_short	Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary
title_sort	object recognition as machine translation learning a lexicon for a fixed image vocabulary
work_keys_str_mv	AT duygulup objectrecognitionasmachinetranslationlearningalexiconforafixedimagevocabulary AT barnardk objectrecognitionasmachinetranslationlearningalexiconforafixedimagevocabulary AT freitasj objectrecognitionasmachinetranslationlearningalexiconforafixedimagevocabulary AT forsythd objectrecognitionasmachinetranslationlearningalexiconforafixedimagevocabulary


Similar Items