Automated gadget discovery in the quantum domain
In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever...
Main Authors: | Lea M Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, Hans J Briegel |
---|---|
Format: | Article |
Language: | English |
Published: | IOP Publishing, 2023-01-01 |
Series: | Machine Learning: Science and Technology |
Subjects: | reinforcement learning, machine learning, sequence mining, quantum optics, quantum information |
Online Access: | https://doi.org/10.1088/2632-2153/acf098 |
_version_ | 1827823885058310144 |
---|---|
author | Lea M Trenkwalder, Andrea López-Incera, Hendrik Poulsen Nautrup, Fulvio Flamini, Hans J Briegel |
author_sort | Lea M Trenkwalder |
collection | DOAJ |
description | In recent years, reinforcement learning (RL) has become increasingly successful in its application to the quantum domain and the process of scientific discovery in general. However, while RL algorithms learn to solve increasingly complex problems, interpreting the solutions they provide becomes ever more challenging. In this work, we gain insights into an RL agent’s learned behavior through a post-hoc analysis based on sequence mining and clustering. Specifically, frequent and compact subroutines, used by the agent to solve a given task, are distilled as gadgets and then grouped by various metrics. This process of gadget discovery develops in three stages: first, we use an RL agent to generate data; then, we employ a mining algorithm to extract gadgets; and finally, the obtained gadgets are grouped by a density-based clustering algorithm. We demonstrate our method by applying it to two quantum-inspired RL environments. First, we consider simulated quantum optics experiments for the design of high-dimensional multipartite entangled states, where the algorithm finds gadgets that correspond to modern interferometer setups. Second, we consider a circuit-based quantum computing environment, where the algorithm discovers various gadgets for quantum information processing, such as quantum teleportation. This approach for analyzing the policy of a learned agent is agent- and environment-agnostic and can yield interesting insights into any agent’s policy. |
first_indexed | 2024-03-12T02:18:10Z |
format | Article |
id | doaj.art-61bf0003a8ea4bb69c5f1e40ea217406 |
institution | Directory of Open Access Journals |
issn | 2632-2153 |
language | English |
last_indexed | 2024-03-12T02:18:10Z |
publishDate | 2023-01-01 |
publisher | IOP Publishing |
record_format | Article |
series | Machine Learning: Science and Technology |
spelling | doaj.art-61bf0003a8ea4bb69c5f1e40ea217406; 2023-09-06T07:17:25Z; eng; IOP Publishing; Machine Learning: Science and Technology; 2632-2153; 2023-01-01; 4; 3; 035043; 10.1088/2632-2153/acf098; Automated gadget discovery in the quantum domain; Lea M Trenkwalder (https://orcid.org/0000-0002-5690-707X), University of Innsbruck, Institute for Theoretical Physics, 6020 Innsbruck, Austria; Andrea López-Incera (https://orcid.org/0000-0002-0522-6610), University of Innsbruck, Institute for Theoretical Physics, 6020 Innsbruck, Austria; Hendrik Poulsen Nautrup (https://orcid.org/0000-0001-7815-7006), University of Innsbruck, Institute for Theoretical Physics, 6020 Innsbruck, Austria; Fulvio Flamini (https://orcid.org/0000-0003-4999-2840), University of Innsbruck, Institute for Theoretical Physics, 6020 Innsbruck, Austria; Hans J Briegel (https://orcid.org/0000-0002-9065-1565), University of Innsbruck, Institute for Theoretical Physics, 6020 Innsbruck, Austria, and Department of Philosophy, University of Konstanz, 78457 Konstanz, Germany; https://doi.org/10.1088/2632-2153/acf098; reinforcement learning, machine learning, sequence mining, quantum optics, quantum information |
title | Automated gadget discovery in the quantum domain |
topic | reinforcement learning, machine learning, sequence mining, quantum optics, quantum information |
url | https://doi.org/10.1088/2632-2153/acf098 |
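
The three-stage process named in the description (generate data with an agent, mine frequent subroutines, group them with a density-based clustering algorithm) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the article: the action sequences are hand-written stand-ins for agent rollouts, the miner simply counts contiguous subsequences, the distance is a Jaccard distance on action sets, and the clustering is a minimal DBSCAN written out in plain Python. The record does not say which mining algorithm, metrics, or parameters the authors use.

```python
from collections import Counter

def mine_gadgets(sequences, min_support=2, min_len=2, max_len=3):
    """Toy miner (assumption): contiguous subsequences found in >= min_support sequences."""
    support = Counter()
    for seq in sequences:
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(seq) - n + 1):
                seen.add(tuple(seq[i:i + n]))
        support.update(seen)  # each pattern counts at most once per sequence
    return {p: c for p, c in support.items() if c >= min_support}

def jaccard_distance(a, b):
    """Hypothetical gadget metric: 1 - |A & B| / |A | B| on the sets of actions."""
    sa, sb = set(a), set(b)
    return 1.0 - len(sa & sb) / len(sa | sb)

def dbscan(items, eps, min_pts, dist):
    """Minimal DBSCAN: one cluster label per item, -1 marks noise."""
    labels = [None] * len(items)

    def neighbors(i):
        # Includes i itself, since dist(x, x) == 0.
        return [j for j in range(len(items)) if dist(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned; do not expand again
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep growing the cluster
    return labels

# Stage 1 (stand-in): hand-written action sequences instead of real agent rollouts.
sequences = [
    ["H", "CNOT", "M", "X"],
    ["Z", "H", "CNOT", "M"],
    ["H", "CNOT", "X", "Z"],
    ["X", "Z", "M"],
]

# Stage 2: extract frequent, compact subsequences as candidate gadgets.
gadgets = mine_gadgets(sequences)

# Stage 3: group the gadgets with density-based clustering.
patterns = sorted(gadgets)
labels = dbscan(patterns, eps=0.5, min_pts=2, dist=jaccard_distance)

for pattern, label in zip(patterns, labels):
    print(label, pattern, gadgets[pattern])
```

The action names, `min_support`, `eps`, and `min_pts` are arbitrary choices for this toy data; a real analysis would tune them to the environment and would likely use a dedicated sequence-mining method rather than plain subsequence counting.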