Stefano V. Albrecht - From Deep Reinforcement Learning to LLM-based Agents

Author: UCL DARK

Uploaded: 2024-05-01

Views: 2689

Description: Invited talk by Stefano V. Albrecht on April 29, 2024 at UCL DARK.

Abstract:
Since the recent successes of large language models (LLMs), we are beginning to see a shift of attention from deep reinforcement learning to LLM-based agents. While deep RL policies are typically learned from scratch to maximise some defined return objective, LLM-agents use an existing LLM at their core and focus on clever prompt engineering and downstream specialisation of the LLM via supervised and reinforcement learning techniques. In this talk, I will first provide a broad overview of my group’s research in deep RL, which focuses among other topics on developing sample-efficient and robust RL algorithms for both single- and multi-agent control tasks, including industry applications in autonomous driving and multi-robot warehouses. I will then present our recent research into LLM-agents, where we propose an approach for household robotics that takes into account user preferences to achieve more robust and effective planning. I will conclude with some personal observations about the state of LLM-agent research: (a) many papers in this field follow essentially the same recipe by focussing on prompt engineering and downstream specialisation; (b) this recipe makes their scientific claims brittle as they depend crucially on the specific LLM engine; and (c) LLMs are not natively designed to maximise objectives for optimal control and decision making. Based on these observations, I believe some fruitful research avenues can be identified.

Bio:
Dr. Stefano V. Albrecht is Associate Professor in Artificial Intelligence in the School of Informatics, University of Edinburgh. He leads the Autonomous Agents Research Group (https://agents.inf.ed.ac.uk) which specialises in developing machine learning algorithms for autonomous systems control and decision making, with a particular focus on reinforcement learning and multi-agent interaction. In his roles as Royal Academy of Engineering and Royal Society Industrial Fellow, he actively develops industry applications in the areas of multi-robot warehouses with Dematic/KION, and autonomous driving with Five AI which completed one of the most extensive urban road trials of autonomous driving in London before being acquired by Bosch in 2022. Dr. Albrecht is affiliated with the Alan Turing Institute where he leads the Multi-Agent Systems theme. In 2022, he was nominated for the IJCAI Computers and Thought Award based on his research which introduced Stochastic Bayesian Games and optimal solution algorithms, which have since been applied in a range of domains. Previously, Dr. Albrecht was a postdoctoral fellow at the University of Texas at Austin working with Prof. Peter Stone. He obtained PhD and MSc degrees in Artificial Intelligence from the University of Edinburgh, and a BSc degree in Computer Science from Technical University of Darmstadt. He is co-author of the new MIT Press textbook "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches" which is freely available at www.marl-book.com.
