TD-MPC Explained, With Alexander Soare (Part 2 of 2)
Author: HuggingFace
Uploaded: 2024-10-24
Views: 1196
Description:
In this video I explain how we train the neural networks of TD-MPC.
TD-MPC paper: https://arxiv.org/abs/2203.04955
FOWM paper (this is what's behind the implementation in the LeRobot library): https://arxiv.org/abs/2310.16029
LeRobot code: https://github.com/huggingface/lerobo...
Many thanks to Nicklas Hansen et al. for publishing their research and open-sourcing their code.
Chapters:
00:00 - Listing the neural networks we need to train
04:53 - What a training batch item looks like
06:09 - Forward passes and losses
13:41 - Why the latent state representation does not collapse
14:24 - Understanding TD Learning
23:42 - TD learning intuition in real experiments
26:58 - Optimizing the Q network using the TD error
30:34 - Offline vs online data collection and training loop
36:20 - Wrapping up