Lecture 1, 2024, course overview: RL and DP, AlphaZero, discrete and continuous applications

Author: Dimitri Bertsekas

Uploaded: 2024-04-27

Views: 5148

Description: Slides, class notes, and related textbook material at http://web.mit.edu/dimitrib/www/RLboo...

The sound of the 1st videolecture of the 2024 class turned out to be degraded. I have instead posted the 1st video of the 2023 class, which has better sound and essentially identical content. Slides can be found at https://web.mit.edu/dimitrib/www/RLTo...

The subsequent videolectures 2-13 are from the 2024 offering of the course. The slides of the 1st lecture of 2024 can be found at https://web.mit.edu/dimitrib/www/RLTo...

Lecture Content: Course overview, AlphaZero, off-line training, on-line play, relation to Newton's method. Exact and approximate dynamic programming for deterministic problems, discrete optimization, model predictive and adaptive control, large language models via dynamic programming, approximation in value space and reinforcement learning

