Lecture 1, 2025, Course overview: RL and DP, AlphaZero, deterministic DP, examples, applications

Author: Dimitri Bertsekas

Uploaded: 2025-01-16

Views: 7110

Description: Slides, class notes, and related textbook material at https://web.mit.edu/dimitrib/www/RLbo...
This site also contains complete PDFs of related textbooks by Bertsekas:
"A Course in Reinforcement Learning", 2nd edition, 2025
"Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control", 2022
"Abstract Dynamic Programming", 3rd edition, 2022
"Rollout, Policy Iteration, and Distributed Reinforcement Learning", 2020
Lecture Content: Course overview, AlphaZero, off-line training, on-line play, relation to Newton's method. Exact and approximate dynamic programming for deterministic problems, discrete optimization, model predictive and adaptive control, large language models via dynamic programming, approximation in value space and reinforcement learning
