Reinforcement Learning Fundamentals - Part 2 - Actor Critic Models (A2C)
Author: John Olafenwa
Uploaded: 2026-01-12
Views: 11
Description:
RL with actor-critic methods. In this video, I explain the challenges of policy gradient methods that use full returns, and introduce value estimation, advantage functions, and actor-critic methods.
This is part 2 of a series that will conclude with running RL on LLMs.
You can find code for this part here: https://github.com/johnolafenwa/agi-p...
And slides here: https://docs.google.com/presentation/...
Contents
00:00:00 Intro
00:00:48 Recap of RL101
00:08:53 The Variance Problem
00:15:12 Advantage Functions
00:28:33 Code Implementation of A2C