Human-like Attention-Driven Saliency Object Estimation in Dynamic Driving Scenes

Identifying a notable object and predicting its importance in front of a vehicle are crucial for automated systems’ risk assessment and decision making. However, current research has rarely exploited the driver’s attentional characteristics. In this study, we propose an attention-driven saliency object estimation (SOE) method that uses the attention intensity of the driver as a criterion for determining the salience and importance of objects. First, we design a driver attention prediction (DAP) network with a 2D-3D mixed convolution encoder–decoder structure. Second, we fuse the DAP network with Faster R-CNN and YOLOv4 at the feature level and name them SOE-F and SOE-Y, respectively, using a shared-bottom multi-task learning (MTL) architecture. By transferring the spatial features onto the time axis, we are able to eliminate the drawback of the bottom features being extracted repeatedly and achieve a uniform image-video input in SOE-F and SOE-Y. Finally, the parameters in SOE-F and SOE-Y are classified into two categories, domain invariant and domain adaptive, and then the domain-adaptive parameters are trained and optimized. The experimental results on the DADA-2000 dataset demonstrate that the proposed method outperforms the state-of-the-art methods in several evaluation metrics and can more accurately predict driver attention. In addition, driven by a human-like attention mechanism, SOE-F and SOE-Y can identify and detect the salience, category, and location of objects, providing risk assessment and a decision basis for autonomous driving systems.
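The shared-bottom multi-task design described in the abstract can be illustrated with a minimal sketch: one feature extractor ("bottom") is computed once and reused by both task heads, and during domain-adaptive training only the head (domain-adaptive) parameters are updated while the shared (domain-invariant) parameters stay frozen. All names, shapes, and the tiny linear "networks" below are illustrative assumptions for demonstration only; the paper's actual models are deep convolutional networks (the DAP network fused with Faster R-CNN / YOLOv4).

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared bottom: one extractor reused by both tasks, so bottom features are
# computed once instead of repeatedly per task (hypothetical toy weights).
W_shared = rng.standard_normal((8, 16))      # domain-invariant (kept frozen)
W_attention = rng.standard_normal((16, 4))   # domain-adaptive head (trained)
W_detection = rng.standard_normal((16, 4))   # domain-adaptive head (trained)

def shared_bottom(x):
    """Map raw input to the shared features used by every task head."""
    return np.tanh(x @ W_shared)

def forward(x):
    """One pass producing both task outputs from the same shared features."""
    feats = shared_bottom(x)                 # extracted once, reused twice
    return feats @ W_attention, feats @ W_detection

# Domain-adaptive fine-tuning step: only the task head receives a gradient
# update; the domain-invariant bottom is left untouched.
x = rng.standard_normal((2, 8))
target_att = np.zeros((2, 4))
W_shared_before = W_shared.copy()

feats = shared_bottom(x)
pred_att = feats @ W_attention
grad = feats.T @ (pred_att - target_att) / len(x)  # MSE gradient w.r.t. head
W_attention -= 0.1 * grad                          # update adaptive params only
# W_shared is unchanged: the domain-invariant parameters stay fixed.
```

This is only a sketch of the parameter-partitioning idea (domain invariant vs. domain adaptive), not the paper's implementation.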


Bibliographic Details
Main Authors: Lisheng Jin, Bingdong Ji, Baicang Guo
Format: Article
Language: English
Published: MDPI AG, 2022-12-01
Series: Machines
ISSN: 2075-1702
Collection: DOAJ (Directory of Open Access Journals)
Subjects: saliency object estimation; driver attention prediction; object detection; multi-task learning; domain adaptive
Online Access: https://www.mdpi.com/2075-1702/10/12/1172
DOI: 10.3390/machines10121172
Author Affiliations: School of Vehicle and Energy, Yanshan University, Qinhuangdao 066000, China (all three authors)