Study on quality evaluation and model testing of human motion dataset under manufacturing scenario

In the field of human trajectory prediction, most existing research focuses on urban roadways or indoor public spaces, often overlooking task-specific behaviors and interactions in industrial environments. To address this issue, our study utilized two datasets collected by Nanyang Technologica...

Bibliographic Details
Main Author: Zhang, Li
Other Authors: Su Rong
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
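Subjects: Engineering; Human trajectory prediction; Industrial environment; Convolutional neural network (CNN); Transformer model; Human-robot collaboration (HRC)
Online Access: https://hdl.handle.net/10356/181401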
_version_ 1826114053584453632
author Zhang, Li
author2 Su Rong
author_facet Su Rong
Zhang, Li
author_sort Zhang, Li
collection NTU
description In the field of human trajectory prediction, most existing research focuses on urban roadways or indoor public spaces and often overlooks task-specific behaviors and interactions in industrial environments. To address this gap, our study used two datasets collected by Nanyang Technological University (NTU) for human movement analysis in a manufacturing environment: the Fixed Detective Perspective (FDP) dataset and the First Person Perspective (FPP) dataset. We extracted three features (pose, trajectory, and ego motion) from these datasets and used them as inputs to a modified Convolutional Neural Network (CNN) and a modified Transformer model for human trajectory prediction. The experiments showed that the CNN is better suited to tasks with strict training-time requirements, while the Transformer model excels in tasks that demand higher accuracy. Moreover, experiments using the Transformer model with the optimal hyperparameter configuration showed that the FDP-trained model achieved a Mean Absolute Error (MAE) of 84.1 pixels, compared with 158.9 pixels for the FPP-trained model, indicating that the FDP dataset, owing to its reduced self-motion noise, is the more suitable input. Furthermore, in scenarios where the image of the human operator is incomplete due to occlusion, the Transformer model trained on the sub-dataset in which humans are occluded had an MAE of 180.7 pixels, while the model trained on the sub-dataset of human movement without occlusion had an MAE of 90.4 pixels, highlighting the challenges that occlusion poses in industrial environments.
In the ablation study, three feature combinations (key points + pose, key points + ego motion, and key points + pose + ego motion) were used as inputs to the Transformer model. The model trained with key points + pose achieved an MAE of 11.82 pixels, the model trained with key points + ego motion had an MAE of 37.04 pixels, and the model trained with key points + pose + ego motion produced the lowest MAE, 10.79 pixels. All three combinations significantly outperformed the model trained solely on trajectory, which had an MAE of 83.98 pixels. These results confirm that the pose feature plays a crucial role in improving the accuracy of the Transformer-based human trajectory prediction model, making it a key feature for enhancing predictive performance in industrial environments.
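The error figures in the description can be compared more directly with a short sketch. The record does not say exactly how the thesis computes MAE, so the function below assumes the common definition (the mean of absolute differences over every x and y pixel coordinate). The names `mean_absolute_error`, `relative_reduction`, `ABLATION_MAE`, and `TRAJECTORY_ONLY_MAE` are illustrative and do not come from the thesis; only the MAE values are taken from the description.

```python
def mean_absolute_error(predicted, actual):
    """Mean absolute error over all x/y coordinates, in pixels.

    Assumed definition: the thesis may instead average per-point
    Euclidean distances, which would give different numbers.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = 0.0
    for (px, py), (ax, ay) in zip(predicted, actual):
        total += abs(px - ax) + abs(py - ay)
    # Two coordinates (x and y) contribute per trajectory point.
    return total / (2 * len(predicted))


def relative_reduction(baseline, value):
    """Percentage by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline * 100.0


# MAE values (pixels) quoted in the description's ablation study.
TRAJECTORY_ONLY_MAE = 83.98
ABLATION_MAE = {
    "key points + pose": 11.82,
    "key points + ego motion": 37.04,
    "key points + pose + ego motion": 10.79,
}

# Error reduction of each feature combination relative to trajectory only.
reductions = {
    name: relative_reduction(TRAJECTORY_ONLY_MAE, mae)
    for name, mae in ABLATION_MAE.items()
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.1f}% lower MAE than trajectory only")
```

Under these assumptions, adding pose alone removes roughly 86% of the trajectory-only error, ego motion alone roughly 56%, and both together roughly 87%, which matches the description's conclusion that pose contributes most of the gain.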
first_indexed 2025-03-09T11:01:11Z
format Thesis-Master by Coursework
id ntu-10356/181401
institution Nanyang Technological University
language English
last_indexed 2025-03-09T11:01:11Z
publishDate 2024
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/181401 2024-12-06T15:49:12Z Zhang, Li; Su Rong; School of Electrical and Electronic Engineering; RSu@ntu.edu.sg; Master's degree; 2024-12-02T02:05:20Z; 2024; Thesis-Master by Coursework; Zhang, L. (2024). Study on quality evaluation and model testing of human motion dataset under manufacturing scenario. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181401; en; application/pdf; Nanyang Technological University
title Study on quality evaluation and model testing of human motion dataset under manufacturing scenario
topic Engineering
Human trajectory prediction
Industrial environment
Convolutional neural network (CNN)
Transformer model
Human-robot collaboration (HRC)
url https://hdl.handle.net/10356/181401