David Abel - A Definition of Continual Reinforcement Learning

Author: RL and Agents Reading Group

Uploaded: 2024-05-20

Views: 1019

Description: UoE RL Reading Group | 2 May 2024

Speaker: David Abel (Google DeepMind)

Title: A Definition of Continual Reinforcement Learning

Authors: David Abel, André Barreto, Benjamin Van Roy, Doina Precup, Hado van Hasselt, Satinder Singh

Abstract: In a standard view of the reinforcement learning problem, an agent’s goal is to efficiently identify a policy that maximizes long-term reward. However, this perspective is based on a restricted view of learning as finding a solution, rather than treating learning as endless adaptation. In contrast, continual reinforcement learning refers to the setting in which the best agents never stop learning. Despite the importance of continual reinforcement learning, the community lacks a simple definition of the problem that highlights its commitments and makes its primary concepts precise and clear. To this end, this paper is dedicated to carefully defining the continual reinforcement learning problem. We formalize the notion of agents that “never stop learning” through a new mathematical language for analyzing and cataloging agents. Using this new language, we define a continual learning agent as one that can be understood as carrying out an implicit search process indefinitely, and continual reinforcement learning as the setting in which the best agents are all continual learning agents.

Link: https://arxiv.org/abs/2307.11046

Bio: David Abel is a Senior Research Scientist at DeepMind in the UK, based in Edinburgh. Before that, he completed his Ph.D. in Computer Science at Brown University.
