Visualize the Transformers Multi-Head Attention in Action
Author: learningcurve
Uploaded: 2021-03-17
Views: 30787
Description:
We depict how a single-layer Multi-Head Attention network applies mathematical projections to Question-Answer data, following the Encoder-Decoder architecture described in the paper "Attention Is All You Need" https://browse.arxiv.org/pdf/1706.037...
Attention networks are used in modern AI technologies such as BERT, GPTx, and ChatGPT because they learn relationships between different parts of the data they encounter. The video provides conceptual depictions of what happens 'under the hood' as abstract concepts in multi-dimensional space are manipulated during training and at inference time.
Python / PyTorch implementation referred to in this video:
https://github.com/learningcurveai/tr...
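For readers who want to see the projections described above in concrete form, here is a minimal sketch of multi-head cross-attention in plain Python. It is not the PyTorch implementation from the linked repository. The function names (matmul, attention_head, multi_head_attention), the toy sizes, and the random, untrained weight matrices are all assumptions chosen for illustration, with the "answer" tokens acting as decoder queries over the "question" tokens.

```python
import math
import random


def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m); both are lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights: uniform random values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def attention_head(q, k, v):
    """Scaled dot-product attention for one head; returns (output, weights)."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # one row of scores per query token
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head_attention(x_q, x_kv, num_heads, rng):
    """Project inputs per head, attend, concatenate heads, project the result."""
    d_model = len(x_q[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads
    head_outputs, head_weights = [], []
    for _ in range(num_heads):
        # Each head gets its own query/key/value projection (random here).
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        out, weights = attention_head(
            matmul(x_q, w_q), matmul(x_kv, w_k), matmul(x_kv, w_v)
        )
        head_outputs.append(out)
        head_weights.append(weights)
    # Concatenate the per-head outputs along the feature axis.
    concat = [[x for head in head_outputs for x in head[i]] for i in range(len(x_q))]
    w_o = random_matrix(d_model, d_model, rng)  # output projection
    return matmul(concat, w_o), head_weights


rng = random.Random(0)
question = random_matrix(3, 8, rng)  # 3 encoder tokens, d_model = 8 (toy values)
answer = random_matrix(4, 8, rng)    # 4 decoder tokens, d_model = 8 (toy values)
output, weights = multi_head_attention(answer, question, num_heads=2, rng=rng)
print(len(output), len(output[0]))   # tokens x d_model
```

Each row of a head's weight matrix sums to 1, so it can be read as how strongly one answer token attends to each question token. In a trained network the projection matrices are learned rather than random, which is what gives those weights meaning.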