Build an LLM from Scratch 3: Coding attention mechanisms
Author: Sebastian Raschka
Uploaded: 2025-03-11
Views: 37790
Description:
Links to the book:
https://amzn.to/4fqvn0D (Amazon)
https://mng.bz/M96o (Manning)
Link to the GitHub repository: https://github.com/rasbt/LLMs-from-sc...
This is a supplementary video explaining how attention mechanisms (self-attention, causal attention, multi-head attention) work by coding them from scratch; minimal code sketches of these steps follow the chapter list below.
00:00 3.3.1 A simple self-attention mechanism without trainable weights
41:01 3.3.2 Computing attention weights for all input tokens
52:40 3.4.1 Computing the attention weights step by step
1:12:33 3.4.2 Implementing a compact SelfAttention class
1:21:00 3.5.1 Applying a causal attention mask
1:32:33 3.5.2 Masking additional attention weights with dropout
1:38:05 3.5.3 Implementing a compact causal self-attention class
1:46:55 3.6.1 Stacking multiple single-head attention layers
1:58:55 3.6.2 Implementing multi-head attention with weight splits
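As a rough illustration of the first step (3.3, a self-attention mechanism without trainable weights, applied to all input tokens), the sketch below computes attention scores as plain dot products between token embeddings and turns them into context vectors. The tensor values and shapes are toy choices for illustration and are not necessarily those used in the video:

```python
import torch

# Toy input: 6 tokens, each embedded in 3 dimensions (values chosen arbitrarily).
inputs = torch.tensor(
    [[0.43, 0.15, 0.89],
     [0.55, 0.87, 0.66],
     [0.57, 0.85, 0.64],
     [0.22, 0.58, 0.33],
     [0.77, 0.25, 0.10],
     [0.05, 0.80, 0.55]]
)

# Attention scores: pairwise dot products between all token embeddings.
attn_scores = inputs @ inputs.T            # shape (6, 6)

# Normalize each row into attention weights that sum to 1.
attn_weights = torch.softmax(attn_scores, dim=-1)

# Context vectors: weighted sums over all input embeddings.
context_vecs = attn_weights @ inputs       # shape (6, 3)
print(context_vecs)
```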
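Section 3.4 adds trainable query, key, and value projections and wraps them in a compact SelfAttention class. The following is a minimal sketch of such a class using scaled dot-product attention; the exact layer choices and argument names in the video may differ:

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Scaled dot-product self-attention with trainable W_query, W_key, W_value."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)

    def forward(self, x):                      # x: (num_tokens, d_in)
        queries = self.W_query(x)
        keys = self.W_key(x)
        values = self.W_value(x)
        attn_scores = queries @ keys.T          # (num_tokens, num_tokens)
        # Scale by sqrt(key dimension) before the softmax.
        attn_weights = torch.softmax(
            attn_scores / keys.shape[-1] ** 0.5, dim=-1
        )
        return attn_weights @ values            # (num_tokens, d_out)
```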
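Sections 3.5 and 3.6 extend this to causal attention (masking future tokens, plus dropout on the attention weights) and to multi-head attention via weight splits. The sketch below combines those steps in one module; the class name, argument names, and hyperparameters here are illustrative assumptions, not necessarily the ones used in the video:

```python
import torch
import torch.nn as nn

class CausalMultiHeadAttention(nn.Module):
    """Causal multi-head attention via weight splitting: one large Q/K/V
    projection whose output is reshaped into several heads."""
    def __init__(self, d_in, d_out, context_length, dropout, num_heads):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        self.out_proj = nn.Linear(d_out, d_out)
        self.dropout = nn.Dropout(dropout)
        # Upper-triangular causal mask stored as a non-trainable buffer
        # (see also the "Understanding PyTorch Buffers" bonus material).
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):                       # x: (b, num_tokens, d_in)
        b, num_tokens, _ = x.shape
        # Project once, then split d_out into (num_heads, head_dim).
        q = self.W_query(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.W_key(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.W_value(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)

        attn_scores = q @ k.transpose(2, 3)     # (b, heads, tokens, tokens)
        # Mask out future positions so each token attends only to the past.
        attn_scores.masked_fill_(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )
        attn_weights = torch.softmax(attn_scores / self.head_dim ** 0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)

        # Merge the heads back into a single d_out dimension.
        context = (attn_weights @ v).transpose(1, 2).reshape(b, num_tokens, -1)
        return self.out_proj(context)
```

Calling the module on a batch of token embeddings of shape (batch, num_tokens, d_in) returns context vectors of shape (batch, num_tokens, d_out).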
You can find additional bonus materials on GitHub:
Comparing Efficient Multi-Head Attention Implementations, https://github.com/rasbt/LLMs-from-sc...
Understanding PyTorch Buffers, https://github.com/rasbt/LLMs-from-sc...