Building Transformer Attention Mechanism from Scratch: Step-by-Step Coding Guide, part 1
Author: Code Surge
Uploaded: 2024-10-23
Views: 825
Description:
Part 2 is published!
• Understanding Transformers & Attention: Ho...
In this video, I'll guide you through coding the encoder part of a Transformer from scratch using only NumPy—no high-level libraries like TensorFlow or PyTorch.
This is a core element of architectures such as DeepSeek, which has had a huge impact on markets.
Here's what we'll cover:
** Vocabulary & Tokenization – I'll start by creating a vocabulary and tokenizing the input text into manageable tokens.
** Word Embedding – Next, I'll transform the tokens into meaningful word embeddings for use in the attention mechanism.
** Attention Mechanism – We'll code multi-head self-attention step by step, calculating the queries, keys, and values. I explain the intuition behind creating these matrices and their role in the network.
** Transformer Encoder – Finally, I'll build the encoder part of the Transformer, focusing on how attention integrates into the overall architecture.
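The steps above can be sketched in plain NumPy. This is a minimal single-head illustration, not the video's actual code: the sample sentence, matrix sizes, and random initialization are all assumptions made for the example.

```python
import numpy as np

# 1. Vocabulary & tokenization: split text and map each unique word to an id.
text = "the cat sat on the mat"
tokens = text.split()
vocab = {word: i for i, word in enumerate(dict.fromkeys(tokens))}
ids = np.array([vocab[w] for w in tokens])

# 2. Word embeddings: a random lookup table (in a real model this is learned).
rng = np.random.default_rng(0)
d_model = 8
embed_table = rng.normal(size=(len(vocab), d_model))
x = embed_table[ids]                      # shape: (seq_len, d_model)

# 3. Self-attention: project the embeddings into queries, keys, and values,
#    then weight the values by softmax-normalized query-key similarity.
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d_model)       # scaled dot-product scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
output = weights @ V                      # each row: attention-mixed values

print(output.shape)                       # (6, 8)
```

Multi-head attention, as covered in the video, repeats step 3 with several smaller projections and concatenates the results before a final linear layer.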
In the next part, I will explore the decoder part of the Transformer, and I will try to train a model for NLP.
00:00 Introduction
01:20 Vocabulary
05:22 Sequences
09:25 Embedding vector
14:43 Transformer encoder
#transformer #deepseek #python