Real-Time Human Tracking Using Multi-Features Visual With CNN-LSTM and Q-Learning

Various methods are employed in computer vision applications to identify individuals, including using face recognition as a human visual feature helpful in tracking or searching for a person. However, tracking systems that rely solely on facial information encounter limitations, particularly when fa...

Bibliographic Details
Main Authors:	Devira Anggi Maharani, Carmadi Machbub, Pranoto Hidaya Rusmin, Lenni Yulianti
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Face and body visual features, CNN, LSTM, Q-learning, real-time
Online Access:	https://ieeexplore.ieee.org/document/10403888/

author	Devira Anggi Maharani, Carmadi Machbub, Pranoto Hidaya Rusmin, Lenni Yulianti
author_sort	Devira Anggi Maharani
collection	DOAJ
description	Various methods are employed in computer vision applications to identify individuals, including face recognition, a human visual feature that is helpful in tracking or searching for a person. However, tracking systems that rely solely on facial information encounter limitations, particularly when faced with occlusions, blurred images, or faces oriented away from the camera. Under these conditions, the system struggles to achieve accurate tracking-based face recognition. Therefore, this research addresses this issue by fusing face visual features with body visual features. When the system cannot find the target face, the CNN+LSTM hybrid method assists in multi-feature body visual recognition, narrowing the search space and speeding up the search process. The results indicate that the CNN+LSTM combination yields higher accuracy, recall, precision, and F1 scores (reaching 89.20%, 87.36%, 91.02%, and 88.43%, respectively) than the single CNN method (reaching 88.84%, 74.00%, 67.00%, and 69.00%, respectively). However, combining these two visual features is computationally expensive. Thus, it is necessary to add a tracking system to reduce the computational load and predict the location. Furthermore, this research utilizes the Q-Learning algorithm to make optimal decisions in automatically tracking objects in dynamic environments. The system considers factors such as face and body visual features, object location, and environmental conditions to make the best decisions, aiming to enhance tracking efficiency and accuracy. Based on the conducted experiments, it is concluded that the system can adjust its actions in response to environmental changes with better outcomes. It achieves an accuracy rate of 91.5% and an average of 50 fps on five different videos, and an accuracy of 84% with an average error of 11.15 pixels on a video benchmark dataset. Utilizing the proposed method speeds up the search process and optimizes tracking decisions, saving time and computational resources.
first_indexed	2024-03-08T09:42:40Z
format	Article
id	doaj.art-5b6a9ed64b5a4e65acce9b18fdfc262e
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-03-08T09:42:40Z
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-5b6a9ed64b5a4e65acce9b18fdfc262e; 2024-01-30T00:02:50Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 13233-13247; DOI 10.1109/ACCESS.2024.3355785; IEEE document 10403888; Real-Time Human Tracking Using Multi-Features Visual With CNN-LSTM and Q-Learning; Devira Anggi Maharani (https://orcid.org/0000-0002-6645-5142), Carmadi Machbub, Pranoto Hidaya Rusmin (https://orcid.org/0009-0001-3003-6288), Lenni Yulianti; all authors: School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Bandung, Indonesia
title	Real-Time Human Tracking Using Multi-Features Visual With CNN-LSTM and Q-Learning
topic	Face and body visual features, CNN, LSTM, Q-learning, real-time
url	https://ieeexplore.ieee.org/document/10403888/

Similar Items