Optimizing trajectories for highway driving with offline reinforcement learning
Achieving feasible, smooth and efficient trajectories for autonomous vehicles which appropriately take into account the long-term future while planning, has been a long-standing challenge. Several approaches have been considered, roughly falling under two categories: rule-based and learning-based ap...
Main Authors: | Branka Mirchevska, Moritz Werling, Joschka Boedecker |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A. 2023-05-01 |
Series: | Frontiers in Future Transportation |
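Subjects: | reinforcement learning; trajectory optimization; autonomous driving; offline reinforcement learning; continuous control |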
Online Access: | https://www.frontiersin.org/articles/10.3389/ffutr.2023.1076439/full |
_version_ | 1797397071037202432 |
---|---|
author | Branka Mirchevska; Moritz Werling; Joschka Boedecker |
author_sort | Branka Mirchevska |
collection | DOAJ |
description | Achieving feasible, smooth and efficient trajectories for autonomous vehicles which appropriately take into account the long-term future while planning, has been a long-standing challenge. Several approaches have been considered, roughly falling under two categories: rule-based and learning-based approaches. The rule-based approaches, while guaranteeing safety and feasibility, fall short when it comes to long-term planning and generalization. The learning-based approaches are able to account for long-term planning and generalization to unseen situations, but may fail to achieve smoothness, safety and the feasibility which rule-based approaches ensure. Hence, combining the two approaches is an evident step towards yielding the best compromise out of both. We propose a Reinforcement Learning-based approach, which learns target trajectory parameters for fully autonomous driving on highways. The trained agent outputs continuous trajectory parameters based on which a feasible polynomial-based trajectory is generated and executed. We compare the performance of our agent against four other highway driving agents. The experiments are conducted in the Sumo simulator, taking into consideration various realistic, dynamically changing highway scenarios, including surrounding vehicles with different driver behaviors. We demonstrate that our offline trained agent, with randomly collected data, learns to drive smoothly, achieving velocities as close as possible to the desired velocity, while outperforming the other agents. |
first_indexed | 2024-03-09T01:04:40Z |
format | Article |
id | doaj.art-6621013391f345dd85d55b79a5c3a31b |
institution | Directory Open Access Journal |
issn | 2673-5210 |
language | English |
last_indexed | 2024-03-09T01:04:40Z |
publishDate | 2023-05-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Future Transportation |
spelling | doaj.art-6621013391f345dd85d55b79a5c3a31b; 2023-12-11T10:14:06Z; eng; Frontiers Media S.A.; Frontiers in Future Transportation; ISSN 2673-5210; 2023-05-01; vol. 4; doi:10.3389/ffutr.2023.1076439; article 1076439. Authors and affiliations: Branka Mirchevska (Department of Computer Science, University of Freiburg, Freiburg, Germany); Moritz Werling (BMW Group, Munich, Germany); Joschka Boedecker (Department of Computer Science, University of Freiburg, Freiburg, Germany; IMBIT // BrainLinks-BrainTools, University of Freiburg, Freiburg, Germany). |
title | Optimizing trajectories for highway driving with offline reinforcement learning |
topic | reinforcement learning; trajectory optimization; autonomous driving; offline reinforcement learning; continuous control |
url | https://www.frontiersin.org/articles/10.3389/ffutr.2023.1076439/full |