Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies
Many real-life systems are usually controlled through policies replicating experts’ knowledge, typically favouring “safety” at the expense of optimality. Indeed, these control policies are usually aimed at avoiding a system’s disruptions or deviations from a target behaviour, leading to suboptimal performances.
Main Authors: | Antonio Candelieri, Andrea Ponti, Elisabetta Fersini, Enza Messina, Francesco Archetti |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-10-01 |
Series: | Mathematics |
Subjects: | optimal control; safe exploration; Gaussian Processes |
Online Access: | https://www.mdpi.com/2227-7390/11/20/4347 |
_version_ | 1797573081321963520 |
---|---|
author | Antonio Candelieri; Andrea Ponti; Elisabetta Fersini; Enza Messina; Francesco Archetti |
author_facet | Antonio Candelieri; Andrea Ponti; Elisabetta Fersini; Enza Messina; Francesco Archetti |
author_sort | Antonio Candelieri |
collection | DOAJ |
description | Many real-life systems are usually controlled through policies replicating experts’ knowledge, typically favouring “safety” at the expense of optimality. Indeed, these control policies are usually aimed at avoiding a system’s disruptions or deviations from a target behaviour, leading to suboptimal performances. This paper proposes a statistical learning approach to exploit the historical safe experience—collected through the application of a safe control policy based on experts’ knowledge— to “safely explore” new and more efficient policies. The basic idea is that performances can be improved by facing a reasonable and quantifiable risk in terms of safety. The proposed approach relies on Gaussian Process regression to obtain a probabilistic model of both a system’s dynamics and performances, depending on the historical safe experience. The new policy consists of solving a constrained optimization problem, with two Gaussian Processes modelling, respectively, the safety constraints and the performance metric (i.e., objective function). As a probabilistic model, Gaussian Process regression provides an estimate of the target variable and the associated uncertainty; this property is crucial for dealing with uncertainty while new policies are safely explored. Another important benefit is that the proposed approach does not require any implementation of an expensive digital twin of the original system. Results on two real-life systems are presented, empirically proving the ability of the approach to improve performances with respect to the initial safe policy without significantly affecting safety. |
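The constrained optimization described above (one Gaussian Process modelling the safety constraint, another modelling the performance metric, with the GP uncertainty used to quantify the risk of leaving the safe region) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 1-D toy system, the `beta`/`threshold` values, and the function name `next_safe_policy` are all illustrative choices.

```python
# Sketch of safe exploration with two Gaussian Processes: one GP models the
# performance metric (objective), the other the safety constraint. A candidate
# policy is considered only if the constraint GP's pessimistic estimate
# (mean - beta * std) stays above a safety threshold. Toy data stand in for
# the "historical safe experience" collected under the expert policy.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

# Historical safe experience: policy parameters tried by the conservative
# expert policy, with observed performance and safety margin (>0 means safe).
X = rng.uniform(0.0, 0.5, size=(20, 1))
perf = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(20)
safety = 1.0 - X[:, 0] + 0.05 * rng.standard_normal(20)

gp_perf = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-2).fit(X, perf)
gp_safe = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-2).fit(X, safety)

def next_safe_policy(candidates, beta=2.0, threshold=0.0):
    """Maximise predicted performance over candidates whose pessimistic
    safety estimate (mean - beta*std) exceeds the threshold."""
    mu_s, sd_s = gp_safe.predict(candidates, return_std=True)
    feasible = (mu_s - beta * sd_s) > threshold
    if not feasible.any():
        return None  # no candidate is provably safe at this confidence level
    mu_p = gp_perf.predict(candidates)
    mu_p[~feasible] = -np.inf  # rule out unsafe candidates
    return candidates[int(np.argmax(mu_p))]

grid = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
best = next_safe_policy(grid)
```

The larger `beta` is, the more pessimistic the safety estimate and the closer the selected policy stays to the expert's region; shrinking `beta` trades quantified safety risk for potential performance gains, which is the core idea of the paper.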
first_indexed | 2024-03-10T21:04:36Z |
format | Article |
id | doaj.art-4be0a83a5c5f40b2b6d64e567a63a159 |
institution | Directory Open Access Journal |
issn | 2227-7390 |
language | English |
last_indexed | 2024-03-10T21:04:36Z |
publishDate | 2023-10-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj.art-4be0a83a5c5f40b2b6d64e567a63a159 2023-11-19T17:14:44Z; eng; MDPI AG; Mathematics, 2227-7390, 2023-10-01, 11(20), 4347; 10.3390/math11204347; Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies; Antonio Candelieri, Andrea Ponti (Department of Economics Management and Statistics, University of Milano-Bicocca, 20126 Milan, Italy); Elisabetta Fersini, Enza Messina, Francesco Archetti (Department of Computer Science Systems and Communication, University of Milano-Bicocca, 20126 Milan, Italy); https://www.mdpi.com/2227-7390/11/20/4347; optimal control; safe exploration; Gaussian Processes |
spellingShingle | Antonio Candelieri Andrea Ponti Elisabetta Fersini Enza Messina Francesco Archetti Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies Mathematics optimal control safe exploration Gaussian Processes |
title | Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies |
title_full | Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies |
title_fullStr | Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies |
title_full_unstemmed | Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies |
title_short | Safe Optimal Control of Dynamic Systems: Learning from Experts and Safely Exploring New Policies |
title_sort | safe optimal control of dynamic systems learning from experts and safely exploring new policies |
topic | optimal control; safe exploration; Gaussian Processes |
url | https://www.mdpi.com/2227-7390/11/20/4347 |
work_keys_str_mv | AT antoniocandelieri safeoptimalcontrolofdynamicsystemslearningfromexpertsandsafelyexploringnewpolicies AT andreaponti safeoptimalcontrolofdynamicsystemslearningfromexpertsandsafelyexploringnewpolicies AT elisabettafersini safeoptimalcontrolofdynamicsystemslearningfromexpertsandsafelyexploringnewpolicies AT enzamessina safeoptimalcontrolofdynamicsystemslearningfromexpertsandsafelyexploringnewpolicies AT francescoarchetti safeoptimalcontrolofdynamicsystemslearningfromexpertsandsafelyexploringnewpolicies |