Lightweight Stacked Hourglass Network for Human Pose Estimation
Human pose estimation is a problem that continues to be one of the greatest challenges in the field of computer vision. While the stacked structure of an hourglass network has enabled substantial progress in human pose estimation and key-point detection areas, it is largely used as a backbone networ...
Main Authors: | Seung-Taek Kim, Hyo Jong Lee |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-09-01 |
Series: | Applied Sciences |
Subjects: | pose estimation; stacked hourglass network; deep learning; convolutional receptive field |
Online Access: | https://www.mdpi.com/2076-3417/10/18/6497 |
_version_ | 1797553400789860352 |
---|---|
author | Seung-Taek Kim; Hyo Jong Lee |
author_facet | Seung-Taek Kim; Hyo Jong Lee |
author_sort | Seung-Taek Kim |
collection | DOAJ |
description | Human pose estimation is a problem that continues to be one of the greatest challenges in the field of computer vision. While the stacked structure of an hourglass network has enabled substantial progress in human pose estimation and key-point detection areas, it is largely used as a backbone network. However, it also requires a relatively large number of parameters and high computational capacity due to the characteristics of its stacked structure. Accordingly, the present work proposes a more lightweight version of the hourglass network, which also improves the human pose estimation performance. The new hourglass network architecture utilizes several additional skip connections, which improve performance with minimal modifications while still maintaining the number of parameters in the network. Additionally, the size of the convolutional receptive field has a decisive effect on learning to detect features of the full human body. Therefore, we propose a multidilated light residual block, which expands the convolutional receptive field while also reducing the computational load. The proposed residual block is also invariant in scale when using multiple dilations. The well-known MPII and LSP human pose datasets were used to evaluate the performance of the proposed method. A variety of experiments were conducted, confirming that our method is more efficient than current state-of-the-art hourglass weight-reduction methods. |
first_indexed | 2024-03-10T16:14:49Z |
format | Article |
id | doaj.art-b453cd2e8af343bfa526efda249ca29c |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-10T16:14:49Z |
publishDate | 2020-09-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-b453cd2e8af343bfa526efda249ca29c; 2023-11-20T14:07:43Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2020-09-01; 10; 18; 6497; 10.3390/app10186497; Lightweight Stacked Hourglass Network for Human Pose Estimation; Seung-Taek Kim; Hyo Jong Lee; Division of Computer Science and Engineering, Jeonbuk National University, Jeonju 54896, Korea; Division of Computer Science and Engineering, Jeonbuk National University, Jeonju 54896, Korea; https://www.mdpi.com/2076-3417/10/18/6497; pose estimation; stacked hourglass network; deep learning; convolutional receptive field |
title | Lightweight Stacked Hourglass Network for Human Pose Estimation |
topic | pose estimation; stacked hourglass network; deep learning; convolutional receptive field |
url | https://www.mdpi.com/2076-3417/10/18/6497 |
work_keys_str_mv | AT seungtaekkim lightweightstackedhourglassnetworkforhumanposeestimation AT hyojonglee lightweightstackedhourglassnetworkforhumanposeestimation |
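
The description above states that a multidilated block expands the convolutional receptive field. The following is a minimal sketch of that arithmetic only, not the authors' implementation: the 3x3 kernel size and the dilation rates 1, 2, 3 are assumptions chosen for illustration, and the helper names `effective_kernel` and `weights_per_filter` are hypothetical. It relies on the standard relation that a k-tap kernel with dilation d spans k + (k - 1)(d - 1) input positions, while the number of learned weights does not depend on d.

```python
def effective_kernel(k: int, d: int) -> int:
    """Input span (per axis) covered by a k-tap kernel with dilation d."""
    return k + (k - 1) * (d - 1)


def weights_per_filter(k: int) -> int:
    """Learned weights in one k x k filter slice; independent of dilation."""
    return k * k


if __name__ == "__main__":
    # Assumed settings for illustration: 3x3 kernels, dilation rates 1 to 3.
    for d in (1, 2, 3):
        span = effective_kernel(3, d)
        print(f"dilation={d}: covers {span}x{span} pixels "
              f"with {weights_per_filter(3)} weights")
```

Under these assumptions, raising the dilation rate widens the area each filter sees without adding weights, which is the general property the abstract appeals to.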