Leveraging transfer learning for spatio-temporal human activity recognition from video sequences

Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recogn...


Bibliographic Details
Main Authors:	Muneer Butt, Umair; Aman Ullah, Hadiqa; Letchmunan, Sukumar; Tariq, Iqra; Hafinaz Hassan, Fadratul; Wei Koh, Tieng
Format:	Article
Language:	English
Published:	Tech Science Press 2023
Online Access:	http://psasir.upm.edu.my/id/eprint/109555/1/TSP_CMC_35512.pdf

_version_	1824452251979087872
author	Muneer Butt, Umair; Aman Ullah, Hadiqa; Letchmunan, Sukumar; Tariq, Iqra; Hafinaz Hassan, Fadratul; Wei Koh, Tieng
author_sort	Muneer Butt, Umair
collection	UPM
description	Human Activity Recognition (HAR) is an active research area due to its applications in pervasive computing, human-computer interaction, artificial intelligence, health care, and social sciences. Moreover, dynamic environments and anthropometric differences between individuals make it harder to recognize actions. This study focuses on human activity in video sequences acquired with an RGB camera because of its vast range of real-world applications. It uses a two-stream ConvNet to extract spatial and temporal information and proposes a fine-tuned deep neural network. Moreover, the transfer learning paradigm is adopted to extract varied and fixed frames while reusing object identification information. Six state-of-the-art pre-trained models are exploited to find the best model for spatial feature extraction. For the temporal sequence, this study uses dense optical flow following the two-stream ConvNet and Bidirectional Long Short-Term Memory (BiLSTM) to capture long-term dependencies. Two state-of-the-art datasets, UCF101 and HMDB51, are used for evaluation purposes. In addition, seven state-of-the-art optimizers are used to fine-tune the proposed network parameters. Furthermore, this study utilizes an ensemble mechanism to aggregate spatial-temporal features using a four-stream Convolutional Neural Network (CNN), where two streams use RGB data while the others use optical flow images. Finally, the proposed ensemble approach using max hard voting outperforms state-of-the-art methods with accuracies of 96.30 and 90.07 on the UCF101 and HMDB51 datasets, respectively.
first_indexed	2025-02-19T02:47:34Z
format	Article
id	upm.eprints-109555
institution	Universiti Putra Malaysia
language	English
last_indexed	2025-02-19T02:47:34Z
publishDate	2023
publisher	Tech Science Press
record_format	dspace
spelling	upm.eprints-109555 2024-12-17T04:00:26Z http://psasir.upm.edu.my/id/eprint/109555/ Tech Science Press 2023 Article PeerReviewed text en http://psasir.upm.edu.my/id/eprint/109555/1/TSP_CMC_35512.pdf Muneer Butt, Umair and Aman Ullah, Hadiqa and Letchmunan, Sukumar and Tariq, Iqra and Hafinaz Hassan, Fadratul and Wei Koh, Tieng (2023) Leveraging transfer learning for spatio-temporal human activity recognition from video sequences. Computers, Materials and Continua, 74 (3). pp. 5017-5033. ISSN 1546-2218; eISSN: 1546-2226 https://www.techscience.com/cmc/v74n3/50975 10.32604/cmc.2023.035512
title	Leveraging transfer learning for spatio-temporal human activity recognition from video sequences
url	http://psasir.upm.edu.my/id/eprint/109555/1/TSP_CMC_35512.pdf
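
The description above says the four stream outputs are combined by max hard voting. As a rough illustration only, and not the authors' implementation, the sketch below shows one common reading of hard voting: each stream contributes one predicted class label per video, and the most frequent label wins. The function name `hard_vote`, the stream variable names, the example labels, and the tie-breaking rule are all assumptions made for this sketch.

```python
from collections import Counter


def hard_vote(stream_predictions):
    """Fuse per-stream class labels by majority (hard) voting.

    stream_predictions: list of equal-length label sequences, one per stream.
    Assumption: a tie goes to the label voted first (earliest stream in the list).
    """
    fused = []
    # zip(*...) groups the votes that all streams cast for the same video.
    for votes in zip(*stream_predictions):
        # most_common(1) returns [(label, count)] for the top-voted label.
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Hypothetical labels for three videos from two RGB and two optical-flow streams.
rgb_a = ["walk", "jump", "run"]
rgb_b = ["walk", "run", "run"]
flow_a = ["run", "jump", "walk"]
flow_b = ["walk", "jump", "run"]

print(hard_vote([rgb_a, rgb_b, flow_a, flow_b]))
```

Hard voting uses only the predicted labels, so the streams do not need calibrated or comparable confidence scores; whether the paper weights streams or resolves ties differently is not stated in this record.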


Similar Items