[Machine Learning in the Generative AI Era (2025)] TA Session: Training Large Language Models on Multiple GPUs - A From-Scratch Introduction to DeepSpeed, Liger Kernel, Flash Attention, and Quantization
Author: Hung-yi Lee
Uploaded: 2025-03-29
Views: 37629
Description:
Slides link (open Excalidraw and import the file): https://drive.google.com/file/d/1pKgY...
Students are welcome to ask questions anonymously on Slido (the link expires on 4/4): https://app.sli.do/event/o69HrUYmKJcP...
Instructor
Hsiu-Hsuan Wang (王秀軒)
Find more at https://anthony-wss.github.io/
Chapters
00:00 accessing the slides, asking questions on Slido
02:20 overview
04:38 introduction
20:20 DeepSpeed
36:10 flash attention
41:15 liger kernel
44:36 quantization
47:05 takeaways & recommended reading
48:45 Q&A
Links & Recommended Reading
flash attention: https://github.com/Dao-AILab/flash-at...
liger kernel: https://github.com/linkedin/Liger-Kernel
TWCC experiment code: https://github.com/anthony-wss/deepsp...
ultra-scale playbook: https://huggingface.co/spaces/nanotro...
transformers deepspeed docs: https://huggingface.co/docs/transform...
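
Usage sketch
The libraries linked above are typically wired together through Hugging Face Transformers. The snippet below is a minimal sketch, not the actual TWCC experiment script: the model id, output directory, and ds_config.json path are placeholders, and 4-bit quantization is included only to illustrate the API (in practice it is usually paired with LoRA-style finetuning rather than full ZeRO-3 training).

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from liger_kernel.transformers import apply_liger_kernel_to_llama

model_name = "meta-llama/Llama-3.2-1B"  # placeholder model id

# Liger Kernel: patch the Llama modules with fused Triton kernels
# (RMSNorm, RoPE, SwiGLU, fused linear cross-entropy) before loading.
apply_liger_kernel_to_llama()

# Quantization: 4-bit NF4 weights via bitsandbytes to reduce weight memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Flash Attention: request the flash_attention_2 implementation
# (requires the flash-attn package and a supported GPU).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    quantization_config=bnb_config,
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# DeepSpeed: the ZeRO stage, optimizer/parameter offloading, etc. live in a
# JSON config that the Hugging Face Trainer picks up via the `deepspeed` argument.
training_args = TrainingArguments(
    output_dir="out",                    # placeholder output directory
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    bf16=True,
    deepspeed="ds_config.json",          # placeholder path to a ZeRO config
)

Launching the resulting training script with the deepspeed or torchrun launcher then shards optimizer states, gradients, and (at ZeRO-3) parameters across the available GPUs according to the stage set in the config file.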