What is a Transformer? (Transformer Walkthrough Part 1/2)

Author: Neel Nanda

Uploaded: 2023-05-01

Views: 35114

Description: See part 2 here: Implementing GPT-2 from Scratch https://neelnanda.io/transformer-tuto...

Template notebook: https://neelnanda.io/transformer-temp...
Solution notebook: https://neelnanda.io/transformer-solu...

If you enjoyed this, I expect you'd enjoy learning more about what's actually going on inside these models and how to reverse engineer them! Check out:
A Comprehensive Mechanistic Interpretability Explainer & Glossary: https://www.neelnanda.io/glossary
Concrete Steps for Getting Started in Mechanistic Interpretability: https://www.neelnanda.io/getting-started
200 Concrete Open Problems in Mechanistic Interpretability: https://www.neelnanda.io/concrete-ope...

Further resources:
The transformers section of my MI explainer: https://dynalist.io/d/n2ZWtnoYHrU1s4v...
My TransformerLens library for doing mechanistic interpretability research on GPT-2 style language models: https://github.com/neelnanda-io/Trans...
My walkthrough of A Mathematical Framework for Transformer Circuits, for a deeper dive into how to think about transformers: • A Walkthrough of A Mathematical Framework ...
Check out these other intros to transformers for another perspective:
Jay Alammar's illustrated transformer: https://jalammar.github.io/illustrate...
Andrej Karpathy's MinGPT: https://github.com/karpathy/minGPT

Timestamps
00:00 Intro
03:12 Setup
07:38 What is the point of a transformer?
12:48 Tokens - Transformer Inputs
25:42 Logits - Transformer Outputs
36:29 Implementation/High level architecture/Embedding
39:08 Attention
45:03 MLPs
48:54 Unembedding
50:43 LayerNorm
59:16 Positional Information
