Lightweight Stacked Hourglass Network for Human Pose Estimation

Human pose estimation is a problem that continues to be one of the greatest challenges in the field of computer vision. While the stacked structure of an hourglass network has enabled substantial progress in human pose estimation and key-point detection areas, it is largely used as a backbone networ...

Bibliographic Details
Main Authors: Seung-Taek Kim, Hyo Jong Lee
Format: Article
Language: English
Published: MDPI AG 2020-09-01
Series: Applied Sciences
Subjects: pose estimation; stacked hourglass network; deep learning; convolutional receptive field
Online Access: https://www.mdpi.com/2076-3417/10/18/6497
author Seung-Taek Kim
Hyo Jong Lee
author_facet Seung-Taek Kim
Hyo Jong Lee
author_sort Seung-Taek Kim
collection DOAJ
description Human pose estimation is a problem that continues to be one of the greatest challenges in the field of computer vision. While the stacked structure of an hourglass network has enabled substantial progress in human pose estimation and key-point detection areas, it is largely used as a backbone network. However, it also requires a relatively large number of parameters and high computational capacity due to the characteristics of its stacked structure. Accordingly, the present work proposes a more lightweight version of the hourglass network, which also improves the human pose estimation performance. The new hourglass network architecture utilizes several additional skip connections, which improve performance with minimal modifications while still maintaining the number of parameters in the network. Additionally, the size of the convolutional receptive field has a decisive effect in learning to detect features of the full human body. Therefore, we propose a multidilated light residual block, which expands the convolutional receptive field while also reducing the computational load. The proposed residual block is also invariant in scale when using multiple dilations. The well-known MPII and LSP human pose datasets were used to evaluate the performance using the proposed method. A variety of experiments were conducted that confirm that our method is more efficient compared to current state-of-the-art hourglass weight-reduction methods.
first_indexed 2024-03-10T16:14:49Z
format Article
id doaj.art-b453cd2e8af343bfa526efda249ca29c
institution Directory of Open Access Journals
issn 2076-3417
language English
last_indexed 2024-03-10T16:14:49Z
publishDate 2020-09-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-b453cd2e8af343bfa526efda249ca29c; 2023-11-20T14:07:43Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2020-09-01; vol. 10, no. 18, art. 6497; DOI 10.3390/app10186497; Lightweight Stacked Hourglass Network for Human Pose Estimation; Seung-Taek Kim and Hyo Jong Lee (both: Division of Computer Science and Engineering, Jeonbuk National University, Jeonju 54896, Korea); https://www.mdpi.com/2076-3417/10/18/6497; pose estimation; stacked hourglass network; deep learning; convolutional receptive field
title Lightweight Stacked Hourglass Network for Human Pose Estimation
topic pose estimation
stacked hourglass network
deep learning
convolutional receptive field
url https://www.mdpi.com/2076-3417/10/18/6497