What is a Transformer? (Transformer Walkthrough Part 1/2)
Author: Neel Nanda
Uploaded: 2023-05-01
Views: 35114
Description:
See part 2 here: Implementing GPT-2 from Scratch https://neelnanda.io/transformer-tuto...
Template notebook: https://neelnanda.io/transformer-temp...
Solution notebook: https://neelnanda.io/transformer-solu...
If you enjoyed this, I expect you'd enjoy learning more about what's actually going on inside these models and how to reverse engineer them! Check out:
A Comprehensive Mechanistic Interpretability Explainer & Glossary: https://www.neelnanda.io/glossary
Concrete Steps for Getting Started in Mechanistic Interpretability: https://www.neelnanda.io/getting-started
200 Concrete Open Problems in Mechanistic Interpretability: https://www.neelnanda.io/concrete-ope...
Further resources:
The transformers section of my MI explainer: https://dynalist.io/d/n2ZWtnoYHrU1s4v...
My TransformerLens library for doing mechanistic interpretability research on GPT-2 style language models: https://github.com/neelnanda-io/Trans...
My walkthrough of A Mathematical Framework for Transformer Circuits, for a deeper dive into how to think about transformers: • A Walkthrough of A Mathematical Framework ...
Check out these other intros to transformers for another perspective:
Jay Alammar's illustrated transformer: https://jalammar.github.io/illustrate...
Andrej Karpathy's MinGPT: https://github.com/karpathy/minGPT
Timestamps
00:00 Intro
03:12 Setup
07:38 What is the point of a transformer?
12:48 Tokens - Transformer Inputs
25:42 Logits - Transformer Outputs
36:29 Implementation/High level architecture/Embedding
39:08 Attention
45:03 MLPs
48:54 Unembedding
50:43 LayerNorm
59:16 Positional Information