Trustworthy autonomous driving via defense-aware robust reinforcement learning against worst-case observational perturbations
Main Authors: He, Xiangkun; Huang, Wenhui; Lv, Chen
Other Authors: School of Mechanical and Aerospace Engineering
Format: Journal Article
Language: English
Published: 2024
Subjects: Engineering; Autonomous vehicle; Traffic safety
Online Access: https://hdl.handle.net/10356/179385
_version_ | 1811684229882314752 |
author | He, Xiangkun Huang, Wenhui Lv, Chen |
author2 | School of Mechanical and Aerospace Engineering |
author_facet | School of Mechanical and Aerospace Engineering He, Xiangkun Huang, Wenhui Lv, Chen |
author_sort | He, Xiangkun |
collection | NTU |
description | Despite the substantial advancements in reinforcement learning (RL) in recent years, ensuring trustworthiness remains a formidable challenge when applying this technology to safety-critical autonomous driving domains. One pivotal bottleneck is that well-trained driving policy models may be particularly vulnerable to observational perturbations or perceptual uncertainties, potentially leading to severe failures. In view of this, we present a novel defense-aware robust RL approach tailored for ensuring the robustness and safety of autonomous vehicles in the face of worst-case attacks on observations. The proposed paradigm primarily comprises two crucial modules: an adversarial attacker and a robust defender. Specifically, the adversarial attacker is devised to approximate the worst-case observational perturbations that attempt to induce safety violations (e.g., collisions) in the RL-driven autonomous vehicle. Additionally, the robust defender is developed to enable the safe RL agent to learn robust optimal policies that maximize the return while constraining the policy deviation and the cost induced by the adversarial attacker within specified bounds. Finally, the proposed technique is assessed across three distinct traffic scenarios: highway, on-ramp, and intersection. The simulation and experimental results indicate that our scheme enables the agent to execute trustworthy driving policies, even in the presence of the worst-case observational perturbations. (An illustrative code sketch of such an attacker-defender training loop follows at the end of this record.)
first_indexed | 2024-10-01T04:25:19Z |
format | Journal Article |
id | ntu-10356/179385 |
institution | Nanyang Technological University |
language | English |
last_indexed | 2024-10-01T04:25:19Z |
publishDate | 2024 |
record_format | dspace |
spelling | ntu-10356/179385 2024-07-29T05:25:00Z Trustworthy autonomous driving via defense-aware robust reinforcement learning against worst-case observational perturbations. He, Xiangkun; Huang, Wenhui; Lv, Chen. School of Mechanical and Aerospace Engineering. Engineering; Autonomous vehicle; Traffic safety. Despite the substantial advancements in reinforcement learning (RL) in recent years, ensuring trustworthiness remains a formidable challenge when applying this technology to safety-critical autonomous driving domains. One pivotal bottleneck is that well-trained driving policy models may be particularly vulnerable to observational perturbations or perceptual uncertainties, potentially leading to severe failures. In view of this, we present a novel defense-aware robust RL approach tailored for ensuring the robustness and safety of autonomous vehicles in the face of worst-case attacks on observations. The proposed paradigm primarily comprises two crucial modules: an adversarial attacker and a robust defender. Specifically, the adversarial attacker is devised to approximate the worst-case observational perturbations that attempt to induce safety violations (e.g., collisions) in the RL-driven autonomous vehicle. Additionally, the robust defender is developed to enable the safe RL agent to learn robust optimal policies that maximize the return while constraining the policy deviation and the cost induced by the adversarial attacker within specified bounds. Finally, the proposed technique is assessed across three distinct traffic scenarios: highway, on-ramp, and intersection. The simulation and experimental results indicate that our scheme enables the agent to execute trustworthy driving policies, even in the presence of the worst-case observational perturbations. Funders: Agency for Science, Technology and Research (A*STAR); Ministry of Education (MOE); National Research Foundation (NRF). This work was supported in part by the Agency for Science, Technology and Research (A*STAR), Singapore, under the Advanced Manufacturing and Engineering (AME) Young Individual Research Grant (A2084c0156), the MTC Individual Research Grant (M22K2c0079), the ANR-NRF Joint Grant (No. NRF2021-NRF-ANR003 HM Science), and the Ministry of Education (MOE), Singapore, under the Tier 2 Grant (MOE-T2EP50222-0002). 2024-07-29T05:25:00Z 2024-07-29T05:25:00Z 2024 Journal Article. Citation: He, X., Huang, W. & Lv, C. (2024). Trustworthy autonomous driving via defense-aware robust reinforcement learning against worst-case observational perturbations. Transportation Research Part C: Emerging Technologies, 163, 104632. https://dx.doi.org/10.1016/j.trc.2024.104632. ISSN 0968-090X. https://hdl.handle.net/10356/179385. DOI 10.1016/j.trc.2024.104632. Scopus 2-s2.0-85191985184. Volume 163, article 104632. Language: en. Grants: A2084c0156; M22K2c0079; NRF2021-NRF-ANR003 HM Science; MOE-T2EP50222-0002. Transportation Research Part C: Emerging Technologies. © 2024 Elsevier Ltd. All rights reserved.
title | Trustworthy autonomous driving via defense-aware robust reinforcement learning against worst-case observational perturbations |
topic | Engineering Autonomous vehicle Traffic safety |
url | https://hdl.handle.net/10356/179385 |
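The abstract above describes an adversarial attacker that searches for worst-case observational perturbations and a robust defender that learns return-maximizing policies while bounding the attacker-induced policy deviation and cost. Below is a minimal, self-contained sketch of how such an attacker-defender interaction could look in code; it is not the paper's implementation. Everything here is an illustrative assumption: a PyTorch Gaussian policy, an L-infinity perturbation budget `EPS`, projected-gradient attack steps, randomly initialized placeholder reward/cost critics, and fixed Lagrange multipliers `lam_kl` / `lam_cost` standing in for a proper constrained update.

```python
# Illustrative sketch only -- NOT the paper's implementation. It shows, under
# simplified assumptions (toy continuous-control policy, L-infinity budget EPS,
# Lagrangian penalties), how an adversarial "attacker" and a robust "defender"
# update could interact in a state-adversarial safe-RL setting.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, EPS, ATTACK_STEPS = 8, 2, 0.05, 10

class GaussianPolicy(nn.Module):
    """Small diagonal-Gaussian policy over continuous actions."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh(),
                                  nn.Linear(64, ACT_DIM))
        self.log_std = nn.Parameter(torch.zeros(ACT_DIM))

    def dist(self, obs):
        return torch.distributions.Normal(self.body(obs), self.log_std.exp())

def attack(policy, cost_critic, obs, eps=EPS, steps=ATTACK_STEPS, lr=0.01):
    """Attacker: projected gradient ascent on an additive observation
    perturbation, pushing the perturbed policy toward high predicted cost."""
    delta = torch.zeros_like(obs, requires_grad=True)
    for _ in range(steps):
        a = policy.dist(obs + delta).rsample()
        loss = -cost_critic(torch.cat([obs + delta, a], dim=-1)).mean()
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= lr * grad.sign()   # descend -cost, i.e. raise predicted cost
            delta.clamp_(-eps, eps)     # stay inside the L-infinity budget
    return delta.detach()

def defender_loss(policy, reward_critic, cost_critic, obs, delta,
                  lam_kl, lam_cost, kl_bound=0.02, cost_bound=0.1):
    """Defender: maximise predicted return while penalising (i) divergence
    between clean and attacked policies and (ii) predicted cost under attack."""
    clean, attacked = policy.dist(obs), policy.dist(obs + delta)
    ret = reward_critic(torch.cat([obs, clean.rsample()], dim=-1)).mean()
    kl = torch.distributions.kl_divergence(clean, attacked).sum(-1).mean()
    cost = cost_critic(torch.cat([obs + delta, attacked.rsample()], dim=-1)).mean()
    return -ret + lam_kl * (kl - kl_bound) + lam_cost * (cost - cost_bound)

if __name__ == "__main__":
    policy = GaussianPolicy()
    # Placeholder critics: in a real pipeline these would be learned from
    # rollouts; random networks are used here so the script runs end to end.
    reward_critic = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, 64), nn.Tanh(),
                                  nn.Linear(64, 1))
    cost_critic = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, 64), nn.Tanh(),
                                nn.Linear(64, 1))
    opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
    obs = torch.randn(32, OBS_DIM)              # stand-in batch of observations
    delta = attack(policy, cost_critic, obs)    # approximate worst-case perturbation
    loss = defender_loss(policy, reward_critic, cost_critic, obs, delta,
                         lam_kl=1.0, lam_cost=1.0)
    opt.zero_grad(); loss.backward(); opt.step()
    print(f"defender loss after one update step: {loss.item():.3f}")
```

In a full pipeline the critics would be trained from environment rollouts and the multipliers updated by dual ascent whenever the divergence or cost bound is violated; the sketch only shows a single defender update against one batch of attacked observations.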