Looking deep at people: towards understanding and generating humans in images with deep learning

Bibliographic Details
Main Author: de Bem, RA
Other Authors: Torr, P
Format: Thesis
Language: English
Published: 2018
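Subjects: Computer vision; Human body analysis; Computer engineering; Deep learning; Artificial intelligence; Machine learning; Computer science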
_version_ 1824459134941003776
author de Bem, RA
author2 Torr, P
author_facet Torr, P
de Bem, RA
author_sort de Bem, RA
collection OXFORD
description <p>Understanding and generating people in images and videos is a long-standing goal in computer vision. The research community has devoted significant effort to these tasks over the last decades, motivated by a large number of potential applications, such as surveillance, human-machine interaction, action and behaviour recognition, motion capture, video reenactment, and computer graphics animation. The effort is also driven by the difficulty of the problems themselves, which stems, for instance, from the endless combinations of environments, visual appearances, and postures in which humans can appear in images. In addition, the high dimensionality of the human body, the inherent noise of visual data, and the ill-posed nature of the problems are relevant issues. Nonetheless, meaningful advances in the field have recently been achieved using deep learning. </p> <p>This thesis pursues further advances towards understanding and generating people in visual data through the development of new discriminative and generative deep learning methods. The main contributions are: </p> <p>i) A deep learning framework for 2D human pose estimation, which allows for mean-field inference over part-based models; </p> <p>ii) A conditional deep generative model that achieves state-of-the-art results on generating images of humans conditioned on body posture; and </p> <p>iii) A structured semi-supervised deep generative model that jointly performs pose estimation and image generation, <em>understanding</em> and <em>generating</em> people in images in a single framework.</p>
first_indexed 2025-02-19T04:36:58Z
format Thesis
id oxford-uuid:d5d3b079-0f10-4371-b849-18cdb0292746
institution University of Oxford
language English
last_indexed 2025-02-19T04:36:58Z
publishDate 2018
record_format dspace
spelling oxford-uuid:d5d3b079-0f10-4371-b849-18cdb0292746 | 2025-02-03T06:31:52Z | Looking deep at people: towards understanding and generating humans in images with deep learning | Thesis | http://purl.org/coar/resource_type/c_db06 | uuid:d5d3b079-0f10-4371-b849-18cdb0292746 | Computer vision | Human body analysis | Computer engineering | Deep learning | Artificial intelligence | Machine learning | Computer science | English | ORA Deposit | 2018 | de Bem, RA | Torr, P
title Looking deep at people: towards understanding and generating humans in images with deep learning
topic Computer vision
Human body analysis
Computer engineering
Deep learning
Artificial intelligence
Machine learning
Computer science