Multi-view fusion and machine learning in hand pose estimation from depth images
This project studies an approach to hand pose estimation that relies on a convolutional neural network. In recent years, hand pose estimation has been the subject of extensive research, with data-driven approaches emerging as the preferred method for the complex task of regressing a hand pose from a depth image. The project develops an application consisting of two major parts: a frontend windowed application that renders a simple hand skeleton model given the locations of 21 landmark hand joints, and a backend containing the required processing logic and, most importantly, the convolutional neural network that drives the application's intelligence. Inputs to the network comprise a planar projection of a point cloud derived from depth images obtained either from a Kinect device or from a hand gesture dataset. After forward propagation, the output of the network is a series of heatmaps encoding the likelihood of each hand joint being at a given location. Although much of the application has been optimised, the neural network still requires further fine-tuning of its hyperparameters owing to exploding-gradient and dying-ReLU problems. Future work could increase heatmap resolution for finer estimation results and gradually reduce the number of convolutional filters while preserving network accuracy, thereby removing redundant neurons and increasing throughput.
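The abstract states that the network outputs one likelihood heatmap per hand joint. The thesis code is not reproduced in this record; the following is a minimal NumPy sketch of how such heatmaps could be decoded into 2D joint coordinates by taking the per-map argmax. The function name `heatmaps_to_joints` and the 21 × 32 × 32 shapes are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def heatmaps_to_joints(heatmaps):
    """Decode per-joint likelihood heatmaps into 2D peak locations.

    heatmaps: array of shape (num_joints, H, W), one map per hand joint.
    Returns an array of shape (num_joints, 2) with (row, col) of each peak.
    """
    num_joints, h, w = heatmaps.shape
    flat_peaks = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    rows, cols = np.unravel_index(flat_peaks, (h, w))
    return np.stack([rows, cols], axis=1)

# Example with 21 joints on coarse 32x32 maps (resolution assumed, not stated in the record).
joints = heatmaps_to_joints(np.random.rand(21, 32, 32))
print(joints.shape)  # (21, 2)
```

Increasing the heatmap resolution, as the abstract suggests for future work, would directly refine the precision of such argmax estimates.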
Main Author: | Ong, Bee Lee |
---|---|
Other Authors: | Lin Feng; School of Computer Science and Engineering |
Format: | Final Year Project (FYP) |
Language: | English |
Published: | Nanyang Technological University, 2018 |
Subjects: | DRNTU::Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence; DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision |
Degree: | Bachelor of Engineering (Computer Science) |
Physical Description: | 28 p. (application/pdf) |
Online Access: | http://hdl.handle.net/10356/74239 |
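The abstract above also describes the network input as a planar projection of a point cloud derived from a depth image (from a Kinect device or a hand gesture dataset). As a rough illustration only, the sketch below back-projects a depth map using assumed pinhole intrinsics (`fx`, `fy`, `cx`, `cy`) and renders a front-view orthographic projection; the thesis may use a different projection or preprocessing pipeline.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map (e.g. in millimetres) into an (N, 3) point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))    # pixel coordinates
    z = depth.astype(np.float32)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                   # drop pixels with no depth reading

def planar_projection(points, size=96):
    """Orthographic front view: nearest depth per cell on a size x size grid."""
    xy = points[:, :2]
    mins, maxs = xy.min(axis=0), xy.max(axis=0)
    scale = (size - 1) / np.maximum(maxs - mins, 1e-6)
    cols, rows = ((xy - mins) * scale).astype(int).T
    image = np.full((size, size), np.inf, dtype=np.float32)
    np.minimum.at(image, (rows, cols), points[:, 2])  # keep the nearest surface per cell
    image[np.isinf(image)] = 0.0                      # empty cells become background
    return image
```

A projection image produced this way (here at an assumed 96 × 96 resolution) is the kind of planar input the abstract describes feeding into the convolutional network.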