Speeding up training with FP8 and Triton
Author: Yandex for Developers
Uploaded: 2025-11-21
Views: 431
Description:
Vlad Savinov, Team Lead for YandexGPT pretraining, walked through the key principles behind speeding up model training.
He discussed how to profile GPU workloads, estimate theoretical performance limits and decide when custom Triton kernels are worthwhile. The session also highlighted the role of lower-precision formats such as BF16 and FP8 in achieving state-of-the-art training efficiency.
The event took place on November 13th at Yandex Hall, a public space for Armenia’s IT community by Yandex Armenia.
We look forward to seeing you at our upcoming events!