Multi-view fusion and machine learning in hand pose estimation from depth images

Bibliographic Details
Main Author: Ong, Bee Lee
Other Authors: Lin Feng
Format: Final Year Project (FYP)
Language: English
Published: 2018
Subjects:
Online Access: http://hdl.handle.net/10356/74239
_version_ 1811687639403724800
author Ong, Bee Lee
author2 Lin Feng
author_facet Lin Feng
Ong, Bee Lee
author_sort Ong, Bee Lee
collection NTU
description This project studies an approach to hand pose estimation that relies on a convolutional neural network. In recent years, hand pose estimation has been the subject of extensive research, with data-driven approaches emerging as the preferred method for the complex task of regressing a hand pose from a depth image. The project delivers an application consisting of two major parts: a frontend windowed application that renders a simple hand skeleton model given the locations of 21 landmark hand joints; and a backend comprising the required processing logic and, most importantly, the convolutional neural network that drives the application’s intelligence. Inputs to the network are planar projections of point clouds derived from depth images obtained either from a Kinect device or from a hand gesture dataset. After forward propagation, the network outputs a series of heatmaps encoding the likelihood of each hand joint being at a given location. Although much of the application has been optimised, the neural network still requires further fine-tuning of its hyperparameters owing to the exploding gradient and dying ReLU problems. Future work could increase heatmap resolution for finer estimation results, and gradually prune convolutional filters while preserving network accuracy, in order to remove redundant neurons and increase throughput.
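The description outlines the backend pipeline (depth image → point cloud → planar projection → CNN → per-joint heatmaps) without giving code. The following is a minimal NumPy sketch of the two ends of that pipeline only; the camera intrinsics, image shapes, and function names are illustrative assumptions and are not taken from the project itself.

```python
# Illustrative sketch only: the report does not publish its preprocessing or decoding
# code. The intrinsics (fx, fy, cx, cy) and array shapes below are assumed values.
import numpy as np

def depth_to_point_cloud(depth, fx=365.0, fy=365.0, cx=256.0, cy=212.0):
    """Back-project a depth image (H x W, in metres) into an N x 3 point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates
    z = depth
    x = (u - cx) * z / fx                            # pinhole camera model
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                  # drop pixels with no depth reading

def decode_heatmaps(heatmaps):
    """Take the argmax of each joint heatmap (J x H x W) as its estimated 2D location."""
    joints = []
    for hm in heatmaps:
        row, col = np.unravel_index(np.argmax(hm), hm.shape)
        joints.append((col, row))                    # (x, y) in heatmap coordinates
    return np.array(joints)                          # e.g. shape (21, 2) for 21 hand joints
```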
first_indexed 2024-10-01T05:19:31Z
format Final Year Project (FYP)
id ntu-10356/74239
institution Nanyang Technological University
language English
last_indexed 2024-10-01T05:19:31Z
publishDate 2018
record_format dspace
spelling ntu-10356/74239 2023-03-03T20:41:43Z Multi-view fusion and machine learning in hand pose estimation from depth images Ong, Bee Lee Lin Feng School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision This project studies an approach to hand pose estimation that relies on a convolutional neural network. In recent years, hand pose estimation has been the subject of extensive research, with data-driven approaches emerging as the preferred method for the complex task of regressing a hand pose from a depth image. The project delivers an application consisting of two major parts: a frontend windowed application that renders a simple hand skeleton model given the locations of 21 landmark hand joints; and a backend comprising the required processing logic and, most importantly, the convolutional neural network that drives the application’s intelligence. Inputs to the network are planar projections of point clouds derived from depth images obtained either from a Kinect device or from a hand gesture dataset. After forward propagation, the network outputs a series of heatmaps encoding the likelihood of each hand joint being at a given location. Although much of the application has been optimised, the neural network still requires further fine-tuning of its hyperparameters owing to the exploding gradient and dying ReLU problems. Future work could increase heatmap resolution for finer estimation results, and gradually prune convolutional filters while preserving network accuracy, in order to remove redundant neurons and increase throughput. Bachelor of Engineering (Computer Science) 2018-05-14T03:12:14Z 2018-05-14T03:12:14Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74239 en Nanyang Technological University 28 p. application/pdf
spellingShingle DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Ong, Bee Lee
Multi-view fusion and machine learning in hand pose estimation from depth images
title Multi-view fusion and machine learning in hand pose estimation from depth images
title_full Multi-view fusion and machine learning in hand pose estimation from depth images
title_fullStr Multi-view fusion and machine learning in hand pose estimation from depth images
title_full_unstemmed Multi-view fusion and machine learning in hand pose estimation from depth images
title_short Multi-view fusion and machine learning in hand pose estimation from depth images
title_sort multi view fusion and machine learning in hand pose estimation from depth images
topic DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
url http://hdl.handle.net/10356/74239
work_keys_str_mv AT ongbeelee multiviewfusionandmachinelearninginhandposeestimationfromdepthimages