Visual analytics using artificial intelligence (multi-modality driver action recognition)

A report published by the National Highway Traffic Safety Administration (NHTSA) in the United States showed that up to 3,522 people were killed due to distracted driving. Various driver monitoring systems have been developed to tackle this issue and potentially save lives and improve road safety; one such system is a driver video action recognition system. This project aims to develop a robust and stable driver action recognition model that utilizes multi-modality data streams, including RGB, IR and depth. A literature review was carried out to determine a suitable model and dataset for the project. Following model and dataset selection, hyperparameter tuning was conducted to optimize VideoMAE V2 for improved accuracy and efficiency on the Drive&Act (DAA) dataset. Various fusion learning techniques were explored and implemented in the system for evaluation. Early fusion achieves an average Top-1 accuracy of 82.40%, while late fusion obtains an average Top-1 accuracy of 84.30% on the test set. Overall, the project demonstrates that incorporating early and late fusion methods with the VideoMAE V2 model achieves satisfactory results, suggesting the potential applicability of the model to other multi-modality action recognition tasks. Future work will explore alternative fusion techniques and extend the model to other driver datasets.
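
The report itself is not reproduced here, but the two fusion strategies it evaluates can be illustrated with a minimal PyTorch sketch: early fusion concatenates the RGB, IR and depth clips along the channel dimension and feeds a single backbone, while late fusion runs one backbone per modality and averages the resulting class scores. The TinyVideoBackbone below is a placeholder standing in for VideoMAE V2, and the channel layout, unweighted score averaging and class count are assumptions for illustration, not the project's actual implementation.

import torch
import torch.nn as nn


class TinyVideoBackbone(nn.Module):
    """Stand-in for a video backbone such as VideoMAE V2: clip (B, C, T, H, W) -> class logits."""

    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global spatio-temporal pooling
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))


class EarlyFusion(nn.Module):
    """Early fusion: concatenate RGB, IR and depth along the channel axis, then one backbone."""

    def __init__(self, num_classes: int):
        super().__init__()
        # 3 (RGB) + 1 (IR) + 1 (depth) = 5 input channels -- an assumed channel layout
        self.backbone = TinyVideoBackbone(in_channels=5, num_classes=num_classes)

    def forward(self, rgb, ir, depth):
        return self.backbone(torch.cat([rgb, ir, depth], dim=1))


class LateFusion(nn.Module):
    """Late fusion: one backbone per modality, then average the per-modality class scores."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.rgb_net = TinyVideoBackbone(3, num_classes)
        self.ir_net = TinyVideoBackbone(1, num_classes)
        self.depth_net = TinyVideoBackbone(1, num_classes)

    def forward(self, rgb, ir, depth):
        logits = torch.stack(
            [self.rgb_net(rgb), self.ir_net(ir), self.depth_net(depth)]
        )
        return logits.mean(dim=0)  # unweighted score averaging


if __name__ == "__main__":
    B, T, H, W, num_classes = 2, 8, 64, 64, 34  # class count is illustrative only
    rgb = torch.randn(B, 3, T, H, W)
    ir = torch.randn(B, 1, T, H, W)
    depth = torch.randn(B, 1, T, H, W)
    print(EarlyFusion(num_classes)(rgb, ir, depth).shape)  # torch.Size([2, 34])
    print(LateFusion(num_classes)(rgb, ir, depth).shape)   # torch.Size([2, 34])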

Bibliographic Details
Main Author: Lee, Jaron Jin-An
Other Authors: Yap Kim Hui
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2024
Subjects: Engineering
Online Access: https://hdl.handle.net/10356/176634