Building Transformer Attention Mechanism from Scratch: Step-by-Step Coding Guide, part 1
Author: Code Surge
Uploaded: 2024-10-23
Views: 825
Description:
Part 2 is published!
• Understanding Transformers & Attention: Ho...
In this video, I'll guide you through coding the encoder part of a Transformer from scratch using only NumPy—no high-level libraries like TensorFlow or PyTorch.
This is a core element of architectures such as DeepSeek, which has had a huge impact on markets.
Here's what we'll cover:
** Vocabulary & Tokenization – I'll start by creating a vocabulary and tokenizing the input text into manageable tokens.
** Word Embedding – Next, I'll transform the tokens into meaningful word embeddings for use in the attention mechanism.
** Attention Mechanism – We'll code multi-head self-attention step by step, calculating the queries, keys, and values. I explain the intuition behind creating these matrices and their role in the network.
** Transformer Encoder – Finally, I'll build the encoder part of the Transformer, focusing on how attention integrates into the overall architecture.
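The steps above can be sketched in plain NumPy. This is a minimal single-head illustration, not the video's actual code: the sample sentence, matrix sizes, and random initialization are all assumptions made for the example.

```python
import numpy as np

# 1. Vocabulary & tokenization: split text and map each unique word to an id.
text = "the cat sat on the mat"
tokens = text.split()
vocab = {word: i for i, word in enumerate(dict.fromkeys(tokens))}
ids = np.array([vocab[w] for w in tokens])

# 2. Word embeddings: a random lookup table (in a real model this is learned).
rng = np.random.default_rng(0)
d_model = 8
embed_table = rng.normal(size=(len(vocab), d_model))
x = embed_table[ids]                      # shape: (seq_len, d_model)

# 3. Self-attention: project the embeddings into queries, keys, and values,
#    then weight the values by softmax-normalized query-key similarity.
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d_model)       # scaled dot-product scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
output = weights @ V                      # each row: attention-mixed values

print(output.shape)                       # (6, 8)
```

Multi-head attention, as covered in the video, repeats step 3 with several smaller projections and concatenates the results before a final linear layer.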
In the next part, I will explore the decoder part of the Transformer, and I will try to train a model for NLP.
00:00 Introduction
01:20 Vocabulary
05:22 Sequences
09:25 Embedding vector
14:43 Transformer encoder
#transformer #deepseek #python