Andreas Krause - Safe and Efficient Exploration in Reinforcement Learning

Date:

August 27, 2020

Author:

Hrvoje Stojic

Safe and Efficient Exploration in Reinforcement Learning


Abstract

At the heart of Reinforcement Learning lies the challenge of trading exploration -- collecting data for identifying better models -- and exploitation -- using the estimate to make decisions. In simulated environments (e.g., games), exploration is primarily a computational concern. In real-world settings, exploration is costly, and a potentially dangerous proposition, as it requires experimenting with actions that have unknown consequences. In this talk, I will present our work towards rigorously reasoning about safety of exploration in reinforcement learning. I will discuss a model-free approach, where we seek to optimize an unknown reward function subject to unknown constraints. Both reward and constraints are revealed through noisy experiments, and safety requires that no infeasible action is chosen at any point. I will also discuss model-based approaches, where we learn about system dynamics through exploration, yet need to verify safety of the estimated policy. Our approaches use Bayesian inference over the objective, constraints and dynamics, and -- under some regularity conditions -- are guaranteed to be both safe and complete, i.e., converge to a natural notion of reachable optimum. I will also present recent results harnessing the model uncertainty for improving efficiency of exploration, and show experiments on safely and efficiently tuning cyber-physical systems in a data-driven manner.


Notes


  • Andreas Krause is a Professor of Computer Science and Director of Learning & Adaptive Systems Group at ETH Zurich. His personal website can be found here.

Share on social media

Share on social media

Share on social media

Share on social media

Related Seminars

Mickael Binois - Leveraging replication in active learning

We were recently joined by Mickael Binois, to talk about 'Leveraging replication in active learning'.

Jun 24, 2024

Mickael Binois - Leveraging replication in active learning

We were recently joined by Mickael Binois, to talk about 'Leveraging replication in active learning'.

Jun 24, 2024

Mickael Binois - Leveraging replication in active learning

We were recently joined by Mickael Binois, to talk about 'Leveraging replication in active learning'.

Jun 24, 2024

Mickael Binois - Leveraging replication in active learning

We were recently joined by Mickael Binois, to talk about 'Leveraging replication in active learning'.

Jun 24, 2024

Ilija Bogunovic - From Data to Confident Decisions

We were recently joined by Ilija Bogunovic, to talk about 'Robust and Efficient Algorithmic Decision Making'.

Jun 13, 2024

Ilija Bogunovic - From Data to Confident Decisions

We were recently joined by Ilija Bogunovic, to talk about 'Robust and Efficient Algorithmic Decision Making'.

Jun 13, 2024

Ilija Bogunovic - From Data to Confident Decisions

We were recently joined by Ilija Bogunovic, to talk about 'Robust and Efficient Algorithmic Decision Making'.

Jun 13, 2024

Ilija Bogunovic - From Data to Confident Decisions

We were recently joined by Ilija Bogunovic, to talk about 'Robust and Efficient Algorithmic Decision Making'.

Jun 13, 2024

Dario Azzimonti - Preference learning with Gaussian processes

We were recently joined by Dario Azzimonti, to talk about 'Preference learning with Gaussian processes'.

May 23, 2024

Dario Azzimonti - Preference learning with Gaussian processes

We were recently joined by Dario Azzimonti, to talk about 'Preference learning with Gaussian processes'.

May 23, 2024

Dario Azzimonti - Preference learning with Gaussian processes

We were recently joined by Dario Azzimonti, to talk about 'Preference learning with Gaussian processes'.

May 23, 2024

Dario Azzimonti - Preference learning with Gaussian processes

We were recently joined by Dario Azzimonti, to talk about 'Preference learning with Gaussian processes'.

May 23, 2024

Mojmír Mutný - Optimal Experiment Design in Markov Chains

We were recently joined by Mojmír Mutný (ETH Zurich), to talk about 'Optimal Experiment Design in Markov Chains'.

Mar 28, 2024

Mojmír Mutný - Optimal Experiment Design in Markov Chains

We were recently joined by Mojmír Mutný (ETH Zurich), to talk about 'Optimal Experiment Design in Markov Chains'.

Mar 28, 2024

Mojmír Mutný - Optimal Experiment Design in Markov Chains

We were recently joined by Mojmír Mutný (ETH Zurich), to talk about 'Optimal Experiment Design in Markov Chains'.

Mar 28, 2024

Mojmír Mutný - Optimal Experiment Design in Markov Chains

We were recently joined by Mojmír Mutný (ETH Zurich), to talk about 'Optimal Experiment Design in Markov Chains'.

Mar 28, 2024