Jędrzej Maczan - Online Softmax
Author: Cohere
Uploaded: 2026-06-05
Views: 188
Description:
00:00 Welcome and Speaker Intro
00:34 Softmax Basics Explained
02:11 Where Softmax Shows Up
03:09 Computing Softmax and Overflow
06:57 Safe Softmax Trick
08:03 Why Safe Softmax Works
10:02 Loop Cost and Stability Tradeoff
12:17 Online Softmax One Pass
16:40 Proof by Induction
22:07 Parallelizing on GPUs
24:43 Generalized Operator Setup
28:09 Commutativity Proof
30:35 Associativity Proof
37:14 CUDA Code and Flash Attention
39:21 Wrap Up and Q&A
The math behind softmax in FlashAttention
Jędrzej is an ML researcher, interested in the intersection of math and AI.
This session is brought to you by the Cohere Labs Open Science Community - a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you to Katrina Lawrence and Neel Ghoshal, Leads of our ML Math group, for their dedication in organizing this event.
If you’re interested in sharing your work, we welcome you to join us! Simply fill out the form at https://forms.gle/ALND9i6KouEEpCnz6 to express your interest in becoming a speaker.
Join the Cohere Labs Open Science Community to see a full list of upcoming events (https://tinyurl.com/CohereLabsCommuni....