Developing object perception in the low data regime

Objects are central to human perception and understanding of the world. There is an abundance of images available on the internet covering the vast number of objects in the world; however, labelling these images exhaustively to cover all objects is infeasible, limiting the utility of systems...

Bibliographic Details
Main Author:	Kaul, P
Other Authors:	Zisserman, A
Format:	Thesis
Language:	English
Published:	2024
Subjects:	Computer vision; Deep learning (Machine learning)

_version_	1797113335548739584
author	Kaul, P
author2	Zisserman, A
author_facet	Zisserman, A Kaul, P
author_sort	Kaul, P
collection	OXFORD
description	<p>Objects are central to human perception and understanding of the world. An abundance of images covering the vast number of objects in the world is available on the internet; however, labelling these images exhaustively to cover all objects is infeasible, which limits the utility of systems that require strong supervision through large labelled datasets. To address this issue, this thesis develops methods that enable novel objects to be learnt with limited use of manually labelled data.</p> <p>First, we consider few-shot object detection: learning to expand the set of objects that can be detected given only a few manually labelled examples. We show that the few examples available for novel categories can be used to accurately pseudo-label existing data, yielding a large number of novel pseudo-annotations for further detector training.</p> <p>Second, we address the more challenging problem of open-vocabulary object detection, which requires learning to detect novel object categories with no annotated data. We demonstrate that detailed natural language descriptions provide additional visual information that is useful for novel object detection. Moreover, we show that visual exemplars can be aggregated and combined with object descriptions to yield multi-modal classifiers for superior novel object detection.</p> <p>Finally, we consider the problem of object hallucinations in large vision-language models. We propose an automatic method to evaluate the presence of object hallucinations in detailed natural language descriptions of images generated by such models. The method uses language models and labelled detection data to analyse the presence of object hallucinations in generated descriptions automatically and robustly.</p>
first_indexed	2024-04-23T08:27:11Z
format	Thesis
id	oxford-uuid:bc157b41-02c1-403f-8a66-4489fea8494c
institution	University of Oxford
language	English
last_indexed	2024-04-23T08:27:11Z
publishDate	2024
record_format	dspace
spelling	oxford-uuid:bc157b41-02c1-403f-8a66-4489fea8494c 2024-04-19T10:31:21Z Developing object perception in the low data regime Thesis http://purl.org/coar/resource_type/c_db06 uuid:bc157b41-02c1-403f-8a66-4489fea8494c Computer vision Deep learning (Machine learning) English Hyrax Deposit 2024 Kaul, P Zisserman, A Xie, W Philipp, K Torr, P
spellingShingle	Computer vision Deep learning (Machine learning) Kaul, P Developing object perception in the low data regime
title	Developing object perception in the low data regime
title_full	Developing object perception in the low data regime
title_fullStr	Developing object perception in the low data regime
title_full_unstemmed	Developing object perception in the low data regime
title_short	Developing object perception in the low data regime
title_sort	developing object perception in the low data regime
topic	Computer vision Deep learning (Machine learning)
work_keys_str_mv	AT kaulp developingobjectperceptioninthelowdataregime

Similar Items