Safe reinforcement learning in automotive

This work applies the Logically Constrained Reinforcement Learning (LCRL) framework to synthesize policies for active monitoring and management of sensor modules in autonomous vehicles. LCRL allows synthesizing policies for unknown and continuous-state Markov Decision Processes (MDPs) such that a gi...

Bibliographic Details
Main Author: Shah, A
Other Authors: Abate, A
Format: Thesis
Language: English
Published: 2020
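Subjects: Artificial intelligence; Logic, Symbolic and mathematical; Reinforcement Learning; Deep learning (Machine learning); Automated vehicles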
_version_ 1824458697121726464
author Shah, A
author2 Abate, A
author_facet Abate, A
Shah, A
author_sort Shah, A
collection OXFORD
description This work applies the Logically Constrained Reinforcement Learning (LCRL) framework to synthesize policies for active monitoring and management of sensor modules in autonomous vehicles. LCRL allows synthesizing policies for unknown and continuous-state Markov Decision Processes (MDPs) such that a given linear-time property is satisfied. We frame the problem of dynamic sensor module selection in automotive systems as an MDP and define the constraints as a Linear Temporal Logic (LTL) property. The LTL properties are used to guard and guide the MDP in finding an optimal policy. Defining a reward function over the state-action pairs of the MDP based on the LTL property results in learning in a constrained environment. This approach improves performance and scalability in finding the right monitoring policies, as the safety properties enforce bounds on the search space.
first_indexed 2025-02-19T04:30:00Z
format Thesis
id oxford-uuid:726e5248-e784-420d-b712-4e1ce4de55af
institution University of Oxford
language English
last_indexed 2025-02-19T04:30:00Z
publishDate 2020
record_format dspace
spelling oxford-uuid:726e5248-e784-420d-b712-4e1ce4de55af; 2025-01-02T13:03:27Z; Safe reinforcement learning in automotive; Thesis; http://purl.org/coar/resource_type/c_bdcc; uuid:726e5248-e784-420d-b712-4e1ce4de55af; Artificial intelligence; Logic, Symbolic and mathematical; Reinforcement Learning; Deep learning (Machine learning); Automated vehicles; English; Hyrax Deposit; 2020; Shah, A; Abate, A; Hasanbeig, H; Pathak, S
spellingShingle Artificial intelligence
Logic, Symbolic and mathematical
Reinforcement Learning
Deep learning (Machine learning)
Automated vehicles
Shah, A
Safe reinforcement learning in automotive
title Safe reinforcement learning in automotive
title_full Safe reinforcement learning in automotive
title_fullStr Safe reinforcement learning in automotive
title_full_unstemmed Safe reinforcement learning in automotive
title_short Safe reinforcement learning in automotive
title_sort safe reinforcement learning in automotive
topic Artificial intelligence
Logic, Symbolic and mathematical
Reinforcement Learning
Deep learning (Machine learning)
Automated vehicles
work_keys_str_mv AT shaha safereinforcementlearninginautomotive