Optimizing trajectories for highway driving with offline reinforcement learning

Bibliographic Details
Main Authors: Branka Mirchevska, Moritz Werling, Joschka Boedecker
Format: Article
Language: English
Published: Frontiers Media S.A. 2023-05-01
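Series: Frontiers in Future Transportation
Subjects: reinforcement learning; trajectory optimization; autonomous driving; offline reinforcement learning; continuous control
Online Access: https://www.frontiersin.org/articles/10.3389/ffutr.2023.1076439/full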
author Branka Mirchevska
Moritz Werling
Joschka Boedecker
author_sort Branka Mirchevska
collection DOAJ
description Achieving feasible, smooth and efficient trajectories for autonomous vehicles which appropriately take into account the long-term future while planning, has been a long-standing challenge. Several approaches have been considered, roughly falling under two categories: rule-based and learning-based approaches. The rule-based approaches, while guaranteeing safety and feasibility, fall short when it comes to long-term planning and generalization. The learning-based approaches are able to account for long-term planning and generalization to unseen situations, but may fail to achieve smoothness, safety and the feasibility which rule-based approaches ensure. Hence, combining the two approaches is an evident step towards yielding the best compromise out of both. We propose a Reinforcement Learning-based approach, which learns target trajectory parameters for fully autonomous driving on highways. The trained agent outputs continuous trajectory parameters based on which a feasible polynomial-based trajectory is generated and executed. We compare the performance of our agent against four other highway driving agents. The experiments are conducted in the Sumo simulator, taking into consideration various realistic, dynamically changing highway scenarios, including surrounding vehicles with different driver behaviors. We demonstrate that our offline trained agent, with randomly collected data, learns to drive smoothly, achieving velocities as close as possible to the desired velocity, while outperforming the other agents.
first_indexed 2024-03-09T01:04:40Z
format Article
id doaj.art-6621013391f345dd85d55b79a5c3a31b
institution Directory Open Access Journal
issn 2673-5210
language English
last_indexed 2024-03-09T01:04:40Z
publishDate 2023-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Future Transportation
spelling doaj.art-6621013391f345dd85d55b79a5c3a31b
2023-12-11T10:14:06Z
eng
Frontiers Media S.A.
Frontiers in Future Transportation
2673-5210
2023-05-01
4
10.3389/ffutr.2023.1076439
1076439
Optimizing trajectories for highway driving with offline reinforcement learning
Branka Mirchevska (0)
Moritz Werling (1)
Joschka Boedecker (2)
Joschka Boedecker (3)
(0) Department of Computer Science, University of Freiburg, Freiburg, Germany
(1) BMW Group, Munich, Germany
(2) Department of Computer Science, University of Freiburg, Freiburg, Germany
(3) IMBIT // BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany
https://www.frontiersin.org/articles/10.3389/ffutr.2023.1076439/full
reinforcement learning
trajectory optimization
autonomous driving
offline reinforcement learning
continuous control
title Optimizing trajectories for highway driving with offline reinforcement learning
topic reinforcement learning
trajectory optimization
autonomous driving
offline reinforcement learning
continuous control
url https://www.frontiersin.org/articles/10.3389/ffutr.2023.1076439/full