Multi-view fusion and machine learning in hand pose estimation from depth images
This project studies an approach to hand pose estimation that relies on a convolutional neural network. In recent years, hand pose estimation has been the subject of extensive research, with data-driven approaches emerging as the preferred method for the complex task of regressing a hand pose from a depth image. The project develops an application consisting of two major parts: a frontend windowed application that renders a simple hand skeleton model given the locations of 21 landmark hand joints, and a backend containing the required processing logic and, most importantly, the convolutional neural network that drives the application's intelligence. Inputs to the network comprise a planar projection of a point cloud derived from depth images obtained either from a Kinect device or from a hand gesture dataset. After forward propagation, the output of the network is a series of heatmaps encoding the likelihood of each hand joint being at a given location. Although much of the application has been optimised, the neural network still requires further fine-tuning of its hyperparameters owing to exploding-gradient and dying-ReLU problems. Future work could increase heatmap resolution for finer estimation results and gradually reduce the number of convolutional filters while preserving network accuracy, thereby removing redundant neurons and increasing throughput.
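The abstract states that the network outputs one likelihood heatmap per hand joint. The thesis code is not reproduced in this record; the following is a minimal NumPy sketch of how such heatmaps could be decoded into 2D joint coordinates by taking the per-map argmax. The function name `heatmaps_to_joints` and the 21 × 32 × 32 shapes are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def heatmaps_to_joints(heatmaps):
    """Decode per-joint likelihood heatmaps into 2D peak locations.

    heatmaps: array of shape (num_joints, H, W), one map per hand joint.
    Returns an array of shape (num_joints, 2) with (row, col) of each peak.
    """
    num_joints, h, w = heatmaps.shape
    flat_peaks = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    rows, cols = np.unravel_index(flat_peaks, (h, w))
    return np.stack([rows, cols], axis=1)

# Example with 21 joints on coarse 32x32 maps (resolution assumed, not stated in the record).
joints = heatmaps_to_joints(np.random.rand(21, 32, 32))
print(joints.shape)  # (21, 2)
```

Increasing the heatmap resolution, as the abstract suggests for future work, would directly refine the precision of such argmax estimates.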
Main Author: | Ong, Bee Lee |
---|---|
Other Authors: | Lin Feng; School of Computer Science and Engineering |
Format: | Final Year Project (FYP) |
Language: | English |
Published: | Nanyang Technological University, 2018 |
Subjects: | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
Degree: | Bachelor of Engineering (Computer Science) |
Physical Description: | 28 p. (application/pdf) |
Online Access: | http://hdl.handle.net/10356/74239 |
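The abstract above also describes the network input as a planar projection of a point cloud derived from a depth image (from a Kinect device or a hand gesture dataset). As a rough illustration only, the sketch below back-projects a depth map using assumed pinhole intrinsics (`fx`, `fy`, `cx`, `cy`) and renders a front-view orthographic projection; the thesis may use a different projection or preprocessing pipeline.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map (e.g. in millimetres) into an (N, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))    # pixel coordinates
    z = depth.astype(np.float32)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                   # drop pixels with no depth reading

def planar_projection(points, size=96):
    """Orthographic front view: nearest depth per cell on a size x size grid."""
    xy = points[:, :2]
    mins, maxs = xy.min(axis=0), xy.max(axis=0)
    scale = (size - 1) / np.maximum(maxs - mins, 1e-6)
    cols, rows = ((xy - mins) * scale).astype(int).T
    image = np.full((size, size), np.inf, dtype=np.float32)
    np.minimum.at(image, (rows, cols), points[:, 2])  # keep the nearest surface per cell
    image[np.isinf(image)] = 0.0                      # empty cells become background
    return image
```

A projection image produced this way (here at an assumed 96 × 96 resolution) is the kind of planar input the abstract describes feeding into the convolutional network.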