Matthew Jackson and Jarek Liesen (Oxford) - A Clean Slate for Offline RL
Author: RL and Agents Reading Group
Uploaded: 2026-03-10
Views: 22
Description:
RL & Agents Reading Group | 9 January 2026
Speakers: Matthew Jackson and Jarek Liesen
Title: A Clean Slate for Offline RL
Abstract:
Despite years of research in offline reinforcement learning (RL), the field has failed to deliver major breakthroughs in its core problem settings. This stagnation is not due to inadequate algorithms, but rather to a failure to rigorously define what constitutes offline RL. Although offline RL explicitly forbids interaction with the environment, much prior work relies on extensive, undocumented online evaluation for hyperparameter tuning, making it impossible to compare methods or determine the state of the art.
In this project, we aim to enable impactful and reproducible research in offline RL. We introduce a transparent and robust evaluation protocol, reimplement a wide range of prior methods in end-to-end JAX, and unify their key components into a Rainbow-style algorithm called Unifloral. Using Unifloral, we conduct a comprehensive reevaluation of existing methods and propose two new state-of-the-art approaches for model-free and model-based offline RL. By publicly releasing our implementation, we make it straightforward to reproduce, evaluate, and extend offline RL methods, making it simple to discover new algorithms.
Links:
ArXiv: https://arxiv.org/abs/2504.11453
GitHub: https://github.com/EmptyJackson/unifl...
Matthew's Bio:
Matthew Jackson is a graduating PhD student in the FLAIR and WhiRL labs at Oxford, interested in video world models and RL as a path to general-purpose robotics. He has worked on the Genie team at Google DeepMind and the GAIA team at Wayve, as well as publishing research in diffusion, video models, and offline and meta RL.
Jarek's Bio:
Jarek Liesen is a second-year PhD student in the FLAIR group at Oxford focusing on scalable reinforcement learning. He is the author of Rejax, a hardware-accelerated reinforcement learning library in pure JAX, and a co-author of A Clean Slate for Offline Reinforcement Learning, which introduces rigorous evaluation protocols and the Unifloral offline RL library.