A Dive Into Multihead Attention, Self-Attention and Cross-Attention
Author: Machine Learning Studio
Uploaded: 2023-04-16
Views: 55132
Description:
In this video, I will first give a recap of Scaled Dot-Product Attention and then dive into Multihead Attention. After that, we will look at two different ways of using the attention mechanism: Self-Attention and Cross-Attention.
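For reference, here is a minimal NumPy sketch of Scaled Dot-Product Attention as used in the solution below (a single head, with the learned projections omitted; the function name is my own, not from the video):

import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Scaled Dot-Product Attention: softmax(Q K^T / sqrt(d)) V
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # compatibility matrix
    scores -= scores.max(axis=-1, keepdims=True)    # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V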
Solution to the exercise:
We have
X: T1 x d
Y: T2 x d
We build Q from Y, which means
Q: T2 x d
and we build K and V from X, therefore
K: T1 x d
V: T1 x d
Then QK^T (the compatibility matrix) will be
QK^T: T2 x T1
and the final output Z = softmax(QK^T / sqrt(d)) V has shape
Z: T2 x d
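A quick way to confirm these shapes is to run them. Below is a small NumPy sketch of this cross-attention setup (the sizes T1 = 5, T2 = 3, d = 8 are arbitrary illustrative values, and the learned projection matrices that would normally produce Q, K, and V are omitted so the shapes stay easy to follow):

import numpy as np

T1, T2, d = 5, 3, 8                 # arbitrary example sizes
X = np.random.randn(T1, d)          # X: T1 x d
Y = np.random.randn(T2, d)          # Y: T2 x d

# Cross-attention: Q comes from Y, while K and V come from X
# (learned projections omitted for shape-checking purposes).
Q, K, V = Y, X, X

scores = Q @ K.T / np.sqrt(d)       # QK^T / sqrt(d): T2 x T1
scores -= scores.max(axis=-1, keepdims=True)
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
Z = weights @ V                     # Z: T2 x d

print(scores.shape)                 # (3, 5) -> T2 x T1
print(Z.shape)                      # (3, 8) -> T2 x d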