Build an LLM from Scratch 3: Coding attention mechanisms
Author: Sebastian Raschka
Uploaded: 2025-03-11
Views: 37790
Description:
Links to the book:
https://amzn.to/4fqvn0D (Amazon)
https://mng.bz/M96o (Manning)
Link to the GitHub repository: https://github.com/rasbt/LLMs-from-sc...
This is a supplementary video explaining how attention mechanisms (self-attention, causal attention, multi-head attention) work by coding them from scratch; minimal code sketches of these steps follow the chapter list below.
00:00 3.3.1 A simple self-attention mechanism without trainable weights
41:01 3.3.2 Computing attention weights for all input tokens
52:40 3.4.1 Computing the attention weights step by step
1:12:33 3.4.2 Implementing a compact SelfAttention class
1:21:00 3.5.1 Applying a causal attention mask
1:32:33 3.5.2 Masking additional attention weights with dropout
1:38:05 3.5.3 Implementing a compact causal self-attention class
1:46:55 3.6.1 Stacking multiple single-head attention layers
1:58:55 3.6.2 Implementing multi-head attention with weight splits
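As a rough illustration of the first step (3.3, a self-attention mechanism without trainable weights, applied to all input tokens), the sketch below computes attention scores as plain dot products between token embeddings and turns them into context vectors. The tensor values and shapes are toy choices for illustration and are not necessarily those used in the video:

```python
import torch

# Toy input: 6 tokens, each embedded in 3 dimensions (values chosen arbitrarily).
inputs = torch.tensor(
    [[0.43, 0.15, 0.89],
     [0.55, 0.87, 0.66],
     [0.57, 0.85, 0.64],
     [0.22, 0.58, 0.33],
     [0.77, 0.25, 0.10],
     [0.05, 0.80, 0.55]]
)

# Attention scores: pairwise dot products between all token embeddings.
attn_scores = inputs @ inputs.T            # shape (6, 6)

# Normalize each row into attention weights that sum to 1.
attn_weights = torch.softmax(attn_scores, dim=-1)

# Context vectors: weighted sums over all input embeddings.
context_vecs = attn_weights @ inputs       # shape (6, 3)
print(context_vecs)
```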
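Section 3.4 adds trainable query, key, and value projections and wraps them in a compact SelfAttention class. The following is a minimal sketch of such a class using scaled dot-product attention; the exact layer choices and argument names in the video may differ:

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Scaled dot-product self-attention with trainable W_query, W_key, W_value."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)

    def forward(self, x):                      # x: (num_tokens, d_in)
        queries = self.W_query(x)
        keys = self.W_key(x)
        values = self.W_value(x)
        attn_scores = queries @ keys.T          # (num_tokens, num_tokens)
        # Scale by sqrt(key dimension) before the softmax.
        attn_weights = torch.softmax(
            attn_scores / keys.shape[-1] ** 0.5, dim=-1
        )
        return attn_weights @ values            # (num_tokens, d_out)
```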
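Sections 3.5 and 3.6 extend this to causal attention (masking future tokens, plus dropout on the attention weights) and to multi-head attention via weight splits. The sketch below combines those steps in one module; the class name, argument names, and hyperparameters here are illustrative assumptions, not necessarily the ones used in the video:

```python
import torch
import torch.nn as nn

class CausalMultiHeadAttention(nn.Module):
    """Causal multi-head attention via weight splitting: one large Q/K/V
    projection whose output is reshaped into several heads."""
    def __init__(self, d_in, d_out, context_length, dropout, num_heads):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        self.out_proj = nn.Linear(d_out, d_out)
        self.dropout = nn.Dropout(dropout)
        # Upper-triangular causal mask stored as a non-trainable buffer
        # (see also the "Understanding PyTorch Buffers" bonus material).
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):                       # x: (b, num_tokens, d_in)
        b, num_tokens, _ = x.shape
        # Project once, then split d_out into (num_heads, head_dim).
        q = self.W_query(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.W_key(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.W_value(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)

        attn_scores = q @ k.transpose(2, 3)     # (b, heads, tokens, tokens)
        # Mask out future positions so each token attends only to the past.
        attn_scores.masked_fill_(
            self.mask.bool()[:num_tokens, :num_tokens], float("-inf")
        )
        attn_weights = torch.softmax(attn_scores / self.head_dim ** 0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)

        # Merge the heads back into a single d_out dimension.
        context = (attn_weights @ v).transpose(1, 2).reshape(b, num_tokens, -1)
        return self.out_proj(context)
```

Calling the module on a batch of token embeddings of shape (batch, num_tokens, d_in) returns context vectors of shape (batch, num_tokens, d_out).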
You can find additional bonus materials on GitHub:
Comparing Efficient Multi-Head Attention Implementations, https://github.com/rasbt/LLMs-from-sc...
Understanding PyTorch Buffers, https://github.com/rasbt/LLMs-from-sc...