Performance analysis of a hybrid agent for quantum-accessible reinforcement learning

In the last decade, quantum machine learning has provided fascinating and fundamental improvements to supervised, unsupervised, and reinforcement learning (RL). In RL, a so-called agent is challenged to solve a task given by some environment. The agent learns to solve the task by exploring the environment and exploiting the rewards it receives from it. For some classical task environments, an analogue quantum environment can be constructed that allows rewards to be found quadratically faster by applying quantum algorithms. In this paper, we analytically analyze the behavior of a hybrid agent that combines this quadratic speedup in exploration with the policy update of a classical agent, leading to faster learning than the classical agent alone. We demonstrate that if the classical agent needs on average $\langle J\rangle$ rewards and $\langle T\rangle_{\mathrm{cl}}$ epochs to learn how to solve the task, the hybrid agent will take $\langle T\rangle_{\mathrm{q}} \leqslant \alpha_s \alpha_o \sqrt{\langle T\rangle_{\mathrm{cl}}\,\langle J\rangle}$ epochs on average. Here, $\alpha_s$ and $\alpha_o$ denote constants that depend on the details of the quantum search and are independent of the problem size. Additionally, we prove that if the environment permits at most $\alpha_o k_{\max}$ sequential coherent interactions, e.g. due to noise effects, an improvement of $\langle T\rangle_{\mathrm{q}} \approx \alpha_o \langle T\rangle_{\mathrm{cl}}/(4 k_{\max})$ is still possible.
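
The scaling claimed in the abstract can be made concrete with a small toy model; the following is an illustrative sketch, not the construction analyzed in the paper. It assumes each epoch yields a reward with probability eps under the current policy, so classical exploration needs about 1/eps epochs per reward while Grover-type amplitude amplification needs about pi/(4*sqrt(eps)) coherent interactions, and it lets every reward trigger a policy update that doubles eps. All parameter names and values (eps0, boost, n_rewards, the doubling update, and taking alpha_s, alpha_o of order 1) are hypothetical choices for illustration only.

```python
import math

def classical_epochs(eps: float) -> float:
    # Expected epochs until a reward when each epoch independently
    # succeeds with probability eps (mean of a geometric distribution).
    return 1.0 / eps

def quantum_epochs(eps: float) -> float:
    # Amplitude amplification finds a rewarded interaction in roughly
    # (pi/4) / sqrt(eps) coherent environment interactions.
    return math.pi / (4.0 * math.sqrt(eps))

def hybrid_learning(eps0: float, boost: float, n_rewards: int):
    """Toy model: each reward found triggers a classical policy update
    that multiplies the success probability by `boost`."""
    eps, t_cl, t_q = eps0, 0.0, 0.0
    for _ in range(n_rewards):
        t_cl += classical_epochs(eps)
        t_q += quantum_epochs(eps)
        eps = min(1.0, eps * boost)
    return t_cl, t_q

if __name__ == "__main__":
    n_rewards = 10  # plays the role of <J>, the rewards needed to learn
    t_cl, t_q = hybrid_learning(eps0=1e-4, boost=2.0, n_rewards=n_rewards)
    # Bound from the abstract with alpha_s, alpha_o taken as ~1:
    bound = math.sqrt(t_cl * n_rewards)
    print(f"classical: {t_cl:,.0f} epochs, hybrid: {t_q:,.0f} epochs, "
          f"bound sqrt(<T>_cl <J>): {bound:,.0f}")
```

With these toy parameters the hybrid count (roughly 260 epochs) stays well below the bound $\sqrt{\langle T\rangle_{\mathrm{cl}}\langle J\rangle}$ (roughly 450 epochs), while the classical agent needs about 20,000 epochs, illustrating the quadratic gap.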

Bibliographic Details
Main Authors: Arne Hamann, Sabine Wölk
Format: Article
Language: English
Published: IOP Publishing 2022-01-01
Series: New Journal of Physics, Vol. 24, 033044 (2022)
ISSN: 1367-2630
Subjects: quantum reinforcement learning; reinforcement learning; amplitude amplification; hybrid quantum–classical algorithm; quantum search
Online Access:https://doi.org/10.1088/1367-2630/ac5b56
Author Affiliations:
Arne Hamann (ORCID: 0000-0002-9016-3641): Institut für Theoretische Physik, Universität Innsbruck, Technikerstraße 21a, 6020 Innsbruck, Austria
Sabine Wölk (ORCID: 0000-0001-9137-4814): Institut für Theoretische Physik, Universität Innsbruck, Technikerstraße 21a, 6020 Innsbruck, Austria; Institute of Quantum Technologies, German Aerospace Center (DLR), D-89081 Ulm, Germany