Expected Return - What Drives a Reinforcement Learning Agent in an MDP

Author: deeplizard

Uploaded: 2018-09-22

Views: 91200

Description: 💡Enroll to gain access to the full course:
https://deeplizard.com/course/rlcpailzrd

Welcome back to this series on reinforcement learning! In this video, we're going to build on the way we think about the cumulative rewards that an agent receives in a Markov decision process and introduce the important concept of return.

We'll see that the return is exactly what's driving the agent to make the decisions it makes. We'll also introduce the idea of episodes and talk about episodic tasks vs. continuing tasks.
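As a rough illustration of the idea above, the return can be sketched as the sum of the rewards an agent collects from some time step until the end of an episode. The function names, the sample reward list, and the discount factor `gamma` below are assumptions made for this example and do not come from the video description. Discounting is the standard way to keep the return finite in continuing tasks, as presented in Sutton and Barto.

```python
def episodic_return(rewards):
    """Undiscounted return: sum of the rewards received until the episode ends."""
    return sum(rewards)


def discounted_return(rewards, gamma):
    """Discounted return: R_1 + gamma * R_2 + gamma^2 * R_3 + ...

    Computed backwards so each step needs one multiply and one add.
    gamma is an assumed discount factor in [0, 1].
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical rewards collected over one short episode.
episode_rewards = [1.0, 0.0, 2.0]

print(episodic_return(episode_rewards))               # 3.0
print(discounted_return(episode_rewards, gamma=0.5))  # 1.5
```

With `gamma=1.0` the two functions agree, which matches the episodic case where the sum is finite. With `gamma` below 1, later rewards count for less, so the agent weighs near-term rewards more heavily.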

Sources:
Reinforcement Learning: An Introduction, Second Edition by Richard S. Sutton and Andrew G. Barto
http://incompleteideas.net/book/RLboo...

Playing Atari with Deep Reinforcement Learning by DeepMind Technologies
https://www.cs.toronto.edu/~vmnih/doc...

🕒🦎 VIDEO SECTIONS 🦎🕒

00:00 Welcome to DEEPLIZARD - Go to deeplizard.com for learning resources
00:30 Help deeplizard add video timestamps - See example in the description
06:18 Collective Intelligence and the DEEPLIZARD HIVEMIND

💥🦎 DEEPLIZARD COMMUNITY RESOURCES 🦎💥

👋 Hey, we're Chris and Mandy, the creators of deeplizard!

👉 Check out the website for more learning material:
🔗 https://deeplizard.com

💻 ENROLL TO GET DOWNLOAD ACCESS TO CODE FILES
🔗 https://deeplizard.com/resources

🧠 Support collective intelligence, join the deeplizard hivemind:
🔗 https://deeplizard.com/hivemind

🧠 Use code DEEPLIZARD at checkout to receive 15% off your first Neurohacker order
👉 Use your receipt from Neurohacker to get a discount on deeplizard courses
🔗 https://neurohacker.com/shop?rfsn=648...

👀 CHECK OUT OUR VLOG:
🔗    / deeplizardvlog

❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
Tammy
Mano Prime
Ling Li

🚀 Boost collective intelligence by sharing this video on social media!

👀 Follow deeplizard:
Our vlog:    / deeplizardvlog
Facebook:   / deeplizard
Instagram:   / deeplizard
Twitter:   / deeplizard
Patreon:   / deeplizard
YouTube:    / deeplizard

🎓 Deep Learning with deeplizard:
Deep Learning Dictionary - https://deeplizard.com/course/ddcpailzrd
Deep Learning Fundamentals - https://deeplizard.com/course/dlcpailzrd
Learn TensorFlow - https://deeplizard.com/course/tfcpailzrd
Learn PyTorch - https://deeplizard.com/course/ptcpailzrd
Natural Language Processing - https://deeplizard.com/course/txtcpai...
Reinforcement Learning - https://deeplizard.com/course/rlcpailzrd
Generative Adversarial Networks - https://deeplizard.com/course/gacpailzrd

🎓 Other Courses:
DL Fundamentals Classic - https://deeplizard.com/learn/video/gZ...
Deep Learning Deployment - https://deeplizard.com/learn/video/SI...
Data Science - https://deeplizard.com/learn/video/d1...
Trading - https://deeplizard.com/learn/video/Zp...

🛒 Check out products deeplizard recommends on Amazon:
🔗 https://amazon.com/shop/deeplizard

🎵 deeplizard uses music by Kevin MacLeod
🔗    / @incompetech_kmac

❤️ Please use the knowledge gained from deeplizard content for good, not evil.

