Pose Mask: A Model-Based Augmentation Method for 2D Pose Estimation in Classroom Scenes Using Surveillance Images

Solid developments have been seen in deep-learning-based pose estimation, but few works have explored performance in dense crowds, such as a classroom scene; furthermore, no specific knowledge is considered in the design of image augmentation for pose estimation. A masked autoencoder was shown to ha...

Bibliographic Details
Main Authors: Shichang Liu, Miao Ma, Haiyang Li, Hanyang Ning, Min Wang
Format: Article
Language: English
Published: MDPI AG 2022-10-01
Series: Sensors
Subjects: pose estimation; masked autoencoder; model-based augmentation; classroom scenes
Online Access: https://www.mdpi.com/1424-8220/22/21/8331
author Shichang Liu
Miao Ma
Haiyang Li
Hanyang Ning
Min Wang
collection DOAJ
description Solid developments have been seen in deep-learning-based pose estimation, but few works have explored performance in dense crowds, such as a classroom scene; furthermore, no specific knowledge is considered in the design of image augmentation for pose estimation. A masked autoencoder was shown to have a non-negligible capability in image reconstruction, where the masking mechanism that randomly drops patches forces the model to build unknown pixels from known pixels. Inspired by this self-supervised learning method, where the restoration of the feature loss induced by the mask is consistent with tackling the occlusion problem in classroom scenarios, we discovered that the transfer performance of the pre-trained weights could be used as a model-based augmentation to overcome the intractable occlusion in classroom pose estimation. In this study, we proposed a top-down pose estimation method that utilized the natural reconstruction capability of missing information of the MAE as an effective occluded image augmentation in a pose estimation task. The difference with the original MAE was that instead of using a 75% random mask ratio, we regarded the keypoint distribution probabilistic heatmap as a reference for masking, which we named Pose Mask. To test the performance of our method in heavily occluded classroom scenes, we collected a new dataset for pose estimation in classroom scenes named Class Pose and conducted many experiments, the results of which showed promising performance.
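The heatmap-guided masking described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state how the heatmap is turned into a mask, so the sampling rule (a patch is masked with probability proportional to the heatmap mass it contains), the 16-pixel patch size, the 0.5 mask ratio, the Gaussian width, and the function names keypoint_heatmap and pose_guided_patch_mask are all assumptions made for this example.

```python
import numpy as np


def keypoint_heatmap(height, width, keypoints, sigma=8.0):
    """Sum of Gaussians centred on (x, y) keypoints, normalised to sum to 1.

    Hypothetical helper: the Gaussian form and sigma are assumptions.
    """
    if len(keypoints) == 0:
        raise ValueError("at least one keypoint is required")
    ys, xs = np.mgrid[0:height, 0:width]
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y in keypoints:
        heat += np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
    return heat / heat.sum()


def pose_guided_patch_mask(heatmap, patch=16, mask_ratio=0.5, rng=None):
    """Return a boolean (rows, cols) patch grid; True marks a masked patch.

    Assumed rule: patches are drawn without replacement, each with
    probability proportional to the heatmap mass inside it, so patches
    near keypoints are masked more often than background patches.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = heatmap.shape
    gh, gw = h // patch, w // patch
    # Sum the heatmap over each non-overlapping patch x patch block.
    blocks = heatmap[: gh * patch, : gw * patch].reshape(gh, patch, gw, patch)
    probs = blocks.sum(axis=(1, 3)).ravel() + 1e-8  # keep every patch drawable
    probs /= probs.sum()
    n_mask = int(round(mask_ratio * gh * gw))
    chosen = rng.choice(gh * gw, size=n_mask, replace=False, p=probs)
    mask = np.zeros(gh * gw, dtype=bool)
    mask[chosen] = True
    return mask.reshape(gh, gw)


if __name__ == "__main__":
    # A 224 x 224 person crop with two illustrative keypoints.
    heat = keypoint_heatmap(224, 224, [(120, 120), (60, 180)])
    mask = pose_guided_patch_mask(heat, rng=np.random.default_rng(0))
    print(mask.shape, int(mask.sum()))
```

With a 224 x 224 crop and 16-pixel patches the grid has 14 x 14 = 196 patches, so a 0.5 ratio masks 98 of them. Setting every patch probability equal would recover uniform random masking of the kind the abstract attributes to the original MAE.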
first_indexed 2024-03-09T18:39:51Z
format Article
id doaj.art-24446f533798478f9081a70f9a3bbe9d
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-09T18:39:51Z
publishDate 2022-10-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-24446f533798478f9081a70f9a3bbe9d
2023-11-24T06:46:21Z
eng
MDPI AG
Sensors
1424-8220
2022-10-01
22(21):8331
10.3390/s22218331
Pose Mask: A Model-Based Augmentation Method for 2D Pose Estimation in Classroom Scenes Using Surveillance Images
Shichang Liu (School of Computer Science, Shaanxi Normal University, Xi’an 710119, China)
Miao Ma (School of Computer Science, Shaanxi Normal University, Xi’an 710119, China)
Haiyang Li (School of Computer Science, Shaanxi Normal University, Xi’an 710119, China)
Hanyang Ning (School of Computer Science, Shaanxi Normal University, Xi’an 710119, China)
Min Wang (National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, Xi’an 710072, China)
https://www.mdpi.com/1424-8220/22/21/8331
pose estimation
masked autoencoder
model-based augmentation
classroom scenes
title Pose Mask: A Model-Based Augmentation Method for 2D Pose Estimation in Classroom Scenes Using Surveillance Images
topic pose estimation
masked autoencoder
model-based augmentation
classroom scenes
url https://www.mdpi.com/1424-8220/22/21/8331