[DDQN] Deep Reinforcement Learning with Double Q-learning | TDLS Foundational

Author: LLMs Explained - Aggregate Intellect - AI.SCIENCE

Uploaded: 2019-02-28

Views: 7619

Description: Toronto Deep Learning Series - Foundational Stream

https://tdls.a-i.science/events/2019-...

Deep Reinforcement Learning with Double Q-learning

"The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevented. In this paper, we answer all these questions affirmatively. In particular, we first show that the recent DQN algorithm, which combines Q-learning with a deep neural network, suffers from substantial overestimations in some games in the Atari 2600 domain. We then show that the idea behind the Double Q-learning algorithm, which was introduced in a tabular setting, can be generalized to work with large-scale function approximation. We propose a specific adaptation to the DQN algorithm and show that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games."
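To make the abstract's central idea concrete, here is a minimal sketch of the two bootstrap targets it contrasts. It assumes the usual formulation in which DQN lets the target network both choose and evaluate the next action, while Double DQN lets the online network choose the action and the target network evaluate it. The function names, array shapes, and the `gamma` default are illustrative assumptions, not taken from the talk or from any particular library.

```python
import numpy as np


def dqn_target(reward, done, q_target_next, gamma=0.99):
    """DQN-style target: the target network selects AND evaluates the next action.

    reward, done: arrays of shape (batch,); done is 1.0 for terminal transitions.
    q_target_next: target-network Q-values for the next state, shape (batch, actions).
    """
    return reward + gamma * (1.0 - done) * q_target_next.max(axis=1)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double-DQN-style target: online network selects, target network evaluates.

    q_online_next: online-network Q-values for the next state, shape (batch, actions).
    """
    # Action selection uses the online network's estimates.
    best_action = q_online_next.argmax(axis=1)
    # Action evaluation uses the target network's estimate of that action.
    evaluated = q_target_next[np.arange(len(best_action)), best_action]
    return reward + gamma * (1.0 - done) * evaluated


# Toy batch of two transitions with two actions (hypothetical numbers).
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q_online_next = np.array([[1.0, 2.0], [0.5, 0.1]])
q_target_next = np.array([[3.0, 1.5], [0.2, 0.4]])

y_dqn = dqn_target(reward, done, q_target_next, gamma=0.9)
y_ddqn = double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.9)
print(y_dqn, y_ddqn)
```

Because the value evaluated by the target network can never exceed that network's own maximum, the Double-DQN-style target in this sketch is never larger than the DQN-style one for the same inputs, which is one way to see why decoupling selection from evaluation reduces overestimation.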

