Andreas Krause - Safe and Efficient Exploration in Reinforcement Learning

Date:

27 August 2020

Author:

Hrvoje Stojic


Abstract

At the heart of Reinforcement Learning lies the challenge of trading off exploration (collecting data to identify better models) against exploitation (using the current estimate to make decisions). In simulated environments (e.g., games), exploration is primarily a computational concern. In real-world settings, exploration is costly and potentially dangerous, as it requires experimenting with actions that have unknown consequences. In this talk, I will present our work towards rigorously reasoning about the safety of exploration in reinforcement learning. I will discuss a model-free approach, where we seek to optimize an unknown reward function subject to unknown constraints. Both reward and constraints are revealed through noisy experiments, and safety requires that no infeasible action is chosen at any point. I will also discuss model-based approaches, where we learn about system dynamics through exploration, yet need to verify the safety of the estimated policy. Our approaches use Bayesian inference over the objective, constraints and dynamics and, under some regularity conditions, are guaranteed to be both safe and complete, i.e., they converge to a natural notion of reachable optimum. I will also present recent results that harness model uncertainty to improve the efficiency of exploration, and show experiments on safely and efficiently tuning cyber-physical systems in a data-driven manner.
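
The safety requirement in the abstract (never choose an infeasible action while optimizing) can be illustrated with a small sketch. This is not the method presented in the talk. It makes three simplifying assumptions that are not in the abstract: evaluations are noiseless, a single function serves as both reward and safety constraint, and a known Lipschitz bound replaces the Bayesian confidence intervals the abstract refers to. The names `objective`, `safe_explore`, `threshold` and `lipschitz` are hypothetical.

```python
import numpy as np


def objective(x):
    # Hypothetical stand-in for the unknown reward; it also acts as the
    # safety signal (a point is safe when objective(x) >= threshold).
    return np.sin(3.0 * x) + 0.5 * x


def safe_explore(f, grid, x0, threshold, lipschitz, n_iter=50, tol=1e-6):
    """Optimistic search restricted to points that are provably safe.

    Assumes noiseless evaluations and that `lipschitz` upper-bounds the true
    Lipschitz constant of f; both are simplifying assumptions of this sketch.
    """
    xs, ys = [x0], [f(x0)]
    if ys[0] < threshold:
        raise ValueError("the starting point must be safe")
    for _ in range(n_iter):
        X, Y = np.array(xs), np.array(ys)
        # Distance from every grid point to every evaluated point.
        dist = np.abs(grid[:, None] - X[None, :])
        # Tightest lower and upper bounds implied by the Lipschitz assumption.
        lower = np.max(Y - lipschitz * dist, axis=1)
        upper = np.min(Y + lipschitz * dist, axis=1)
        # Only points whose worst case still clears the threshold are allowed.
        safe = lower >= threshold
        # Among safe points, evaluate the one with the best optimistic value.
        idx = int(np.argmax(np.where(safe, upper, -np.inf)))
        if not safe[idx] or upper[idx] - lower[idx] < tol:
            break  # nothing safe left to learn about the optimum
        xs.append(float(grid[idx]))
        ys.append(float(f(grid[idx])))
    return xs, ys


grid = np.linspace(-1.0, 2.0, 301)
# |d/dx objective| <= 3.5, so 4.0 is a valid (conservative) Lipschitz bound.
xs, ys = safe_explore(objective, grid, x0=0.0, threshold=-0.5, lipschitz=4.0)
print(f"evaluated {len(xs)} points, lowest value {min(ys):.3f}, best {max(ys):.3f}")
```

Because every candidate is selected only if its lower bound already clears the threshold, no evaluated point can violate the constraint as long as the Lipschitz assumption holds. The search may stop at the best point reachable from the initial safe region rather than the global optimum, which loosely mirrors the "reachable optimum" mentioned in the abstract.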


Notes


  • Andreas Krause is a Professor of Computer Science and Director of the Learning & Adaptive Systems Group at ETH Zurich. His personal website can be found here.

Related seminars

Mickael Binois - Leveraging replication in active learning

We were recently joined by Mickael Binois, to talk about 'Leveraging replication in active learning'.

2024/06/24

Ilija Bogunovic - From Data to Confident Decisions

We were recently joined by Ilija Bogunovic, to talk about 'Robust and Efficient Algorithmic Decision Making'.

2024/06/13

Dario Azzimonti - Preference learning with Gaussian processes

We were recently joined by Dario Azzimonti, to talk about 'Preference learning with Gaussian processes'.

2024/05/23

Mojmír Mutný - Optimal Experiment Design in Markov Chains

We were recently joined by Mojmír Mutný (ETH Zurich), to talk about 'Optimal Experiment Design in Markov Chains'.

2024/03/28
