Policies and Value Functions - Good Actions for a Reinforcement Learning Agent

Author: deeplizard

Uploaded: 2018-09-27

Views: 107218

Description: 💡 Enroll to gain access to the full course:
https://deeplizard.com/course/rlcpailzrd

Welcome back to this series on reinforcement learning! In this video, we're going to pick up where we left off with Markov Decision Processes and discuss the topics of policies and value functions. This will give us a way to measure “how good” it is for an agent to be in a given state or to select a given action.
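As a rough illustration of what measuring "how good" a state or action is can look like, here is a minimal sketch that evaluates one fixed policy on a made-up two-state MDP. Everything in it is an assumption for illustration and not taken from the video: the states `A` and `B`, the actions `stay` and `move`, the rewards, the discount factor `GAMMA = 0.9`, and the helper names `action_value` and `evaluate_policy`. It uses simple iterative policy evaluation, which is only one of several ways to compute these values.

```python
# Hypothetical two-state MDP, invented for illustration only.
# TRANSITIONS[state][action] -> list of (probability, next_state, reward)
TRANSITIONS = {
    "A": {"stay": [(1.0, "A", 0.0)], "move": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)], "move": [(1.0, "A", 0.0)]},
}
GAMMA = 0.9  # assumed discount factor

# A policy maps each state to a probability distribution over actions.
POLICY = {
    "A": {"stay": 0.0, "move": 1.0},
    "B": {"stay": 1.0, "move": 0.0},
}


def action_value(state, action, v):
    """Expected return of taking `action` in `state`, then following the policy."""
    return sum(p * (r + GAMMA * v[s2]) for p, s2, r in TRANSITIONS[state][action])


def evaluate_policy(policy, tol=1e-10):
    """Iterative policy evaluation: sweep the states until values stop changing."""
    v = {s: 0.0 for s in TRANSITIONS}
    while True:
        delta = 0.0
        for s in TRANSITIONS:
            new_v = sum(prob * action_value(s, a, v) for a, prob in policy[s].items())
            delta = max(delta, abs(new_v - v[s]))
            v[s] = new_v
        if delta < tol:
            return v


# State-value function: how good it is to be in each state under POLICY.
v_pi = evaluate_policy(POLICY)
# Action-value function: how good it is to select each action in each state.
q_pi = {(s, a): action_value(s, a, v_pi) for s in TRANSITIONS for a in TRANSITIONS[s]}
```

Under these assumed numbers, `v_pi` comes out at roughly 19 for `A` and 20 for `B`, and `q_pi[("A", "stay")]` at roughly 17.1, so in this toy setup moving from `A` looks better than staying there.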

Sources:
Reinforcement Learning: An Introduction, Second Edition by Richard S. Sutton and Andrew G. Barto
http://incompleteideas.net/book/RLboo...

Playing Atari with Deep Reinforcement Learning by DeepMind Technologies
https://www.cs.toronto.edu/~vmnih/doc...

🕒🦎 VIDEO SECTIONS 🦎🕒

00:00 Welcome to DEEPLIZARD - Go to deeplizard.com for learning resources
00:30 Help deeplizard add video timestamps - See example in the description
06:22 Collective Intelligence and the DEEPLIZARD HIVEMIND

💥🦎 DEEPLIZARD COMMUNITY RESOURCES 🦎💥

👋 Hey, we're Chris and Mandy, the creators of deeplizard!

👉 Check out the website for more learning material:
🔗 https://deeplizard.com

💻 ENROLL TO GET DOWNLOAD ACCESS TO CODE FILES
🔗 https://deeplizard.com/resources

🧠 Support collective intelligence, join the deeplizard hivemind:
🔗 https://deeplizard.com/hivemind

🧠 Use code DEEPLIZARD at checkout to receive 15% off your first Neurohacker order
👉 Use your receipt from Neurohacker to get a discount on deeplizard courses
🔗 https://neurohacker.com/shop?rfsn=648...

👀 CHECK OUT OUR VLOG:
🔗    / deeplizardvlog

❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
Tammy
Mano Prime
Ling Li

🚀 Boost collective intelligence by sharing this video on social media!

👀 Follow deeplizard:
Our vlog:    / deeplizardvlog
Facebook:   / deeplizard
Instagram:   / deeplizard
Twitter:   / deeplizard
Patreon:   / deeplizard
YouTube:    / deeplizard

🎓 Deep Learning with deeplizard:
Deep Learning Dictionary - https://deeplizard.com/course/ddcpailzrd
Deep Learning Fundamentals - https://deeplizard.com/course/dlcpailzrd
Learn TensorFlow - https://deeplizard.com/course/tfcpailzrd
Learn PyTorch - https://deeplizard.com/course/ptcpailzrd
Natural Language Processing - https://deeplizard.com/course/txtcpai...
Reinforcement Learning - https://deeplizard.com/course/rlcpailzrd
Generative Adversarial Networks - https://deeplizard.com/course/gacpailzrd

🎓 Other Courses:
DL Fundamentals Classic - https://deeplizard.com/learn/video/gZ...
Deep Learning Deployment - https://deeplizard.com/learn/video/SI...
Data Science - https://deeplizard.com/learn/video/d1...
Trading - https://deeplizard.com/learn/video/Zp...

🛒 Check out products deeplizard recommends on Amazon:
🔗 https://amazon.com/shop/deeplizard

🎵 deeplizard uses music by Kevin MacLeod
🔗    / @incompetech_kmac

❤️ Please use the knowledge gained from deeplizard content for good, not evil.
