Summary
This ERC project pushes the boundary of reliable data-driven decision making in cyber-physical systems (CPS) by bridging reinforcement learning (RL), nonparametric estimation, and robust optimization. RL is a powerful abstraction of decision making under uncertainty and has recently seen dramatic breakthroughs. Most of these successes, however, have been in games such as Go: well-specified, closed environments that, given enough computing power, can be extensively simulated and explored. In real-world CPS, accurate simulations are rarely available, and exploration in these applications is a highly dangerous proposition.
We strive to rethink reinforcement learning from the perspective of the reliability and robustness required by real-world applications. We build on our recent breakthrough result on safe Bayesian optimization (SAFE-OPT): the approach makes it possible, for the first time, to identify provably near-optimal policies in episodic RL tasks while guaranteeing, under some regularity assumptions, that with high probability no unsafe states are visited, even if the set of safe parameter values is a priori unknown.
While extremely promising, this result has several fundamental limitations, which we seek to overcome in this ERC project. To this end we will (1) go beyond low-dimensional Gaussian process models towards much richer deep Bayesian models; (2) go beyond episodic tasks by explicitly reasoning about the dynamics and employing ideas from robust control theory; and (3) tackle the bootstrapping of safe initial policies by bridging simulations and real-world experiments via multi-fidelity Bayesian optimization, and by pursuing safe active imitation learning.
Our research is motivated by three real-world CPS applications, which we pursue in interdisciplinary collaboration: safe exploration of and with robotic platforms; tuning the energy efficiency of photovoltaic power plants; and safely optimizing the performance of a free-electron laser.
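To make the safe-exploration idea above concrete, here is a minimal, numpy-only sketch of the SafeOpt-style strategy: a Gaussian process models the unknown objective, a point counts as certifiably safe when its GP lower confidence bound clears a safety threshold, and the next evaluation is chosen only from that safe set. The toy objective, the threshold `h`, the confidence scaling `beta`, and the kernel hyperparameters are all illustrative assumptions, not part of the project's actual method.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5, variance=1.0):
    # Squared-exponential kernel between two sets of 1-D points.
    d = A[:, None] - B[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X_obs, y_obs, X_grid, noise=1e-4):
    # GP posterior mean and standard deviation on a grid of candidates.
    K = rbf_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    K_inv = np.linalg.inv(K)
    K_s = rbf_kernel(X_grid, X_obs)
    mu = K_s @ K_inv @ y_obs
    var = rbf_kernel(X_grid, X_grid).diagonal() - np.einsum(
        "ij,jk,ik->i", K_s, K_inv, K_s)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def safe_opt_step(X_obs, y_obs, X_grid, h=0.0, beta=2.0):
    # One SafeOpt-style step: restrict attention to candidates whose GP
    # lower confidence bound exceeds the safety threshold h, then pick
    # the safe candidate with the highest upper confidence bound.
    mu, sd = gp_posterior(X_obs, y_obs, X_grid)
    lower, upper = mu - beta * sd, mu + beta * sd
    safe = lower >= h
    if not safe.any():          # nothing is certifiably safe: stop
        return None
    return X_grid[np.argmax(np.where(safe, upper, -np.inf))]

# Toy objective: a point x is "safe" wherever f(x) >= 0.
f = lambda x: np.sin(3 * x) + 0.5
X_grid = np.linspace(-1.0, 2.0, 200)
X_obs = np.array([0.4])         # a priori known-safe seed point
y_obs = f(X_obs)
for _ in range(15):
    x_next = safe_opt_step(X_obs, y_obs, X_grid)
    if x_next is None:
        break
    X_obs = np.append(X_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))
print("best safe value found:", y_obs.max())
```

The key design choice is that exploration happens only where the pessimistic (lower-bound) estimate is already above the threshold, so the safe region grows outward from the known-safe seed rather than by blind trial and error.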
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/815943
Start date: 01-01-2019
End date: 30-06-2024
Total budget - Public funding: 1 996 500,00 Euro - 1 996 500,00 Euro
Status: SIGNED
Call topic: ERC-2018-COG
Update Date: 27-04-2024