Video Gen Platform Auto Scaling New H200 Node in 2 Minutes
Author: Stephen Li
Uploaded: 2025-08-27
Views: 15
Description: It is hard for video generation companies to predict user traffic spikes. Spinning up a new GPU server with 8 GPUs to absorb the new traffic usually takes 15 to 20 minutes, and users will not wait that long for a 5-6 second video. As a result, video generation platforms have to over-reserve GPU servers instead of taking an on-demand approach. In this video, we demonstrate how we built the inference platform for WAN to scale up and down within 2 minutes for on-demand requests, based on queue status.
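
The description mentions scaling up and down based on queue status but does not give the rule itself. As a rough illustration only, a queue-driven sizing rule might look like the sketch below. The function name, parameters, and all numbers are hypothetical and are not taken from the video; it assumes each node processes a roughly constant number of jobs per minute.

```python
import math


def desired_nodes(queued_jobs, jobs_per_node_per_min, target_drain_min,
                  min_nodes=1, max_nodes=8):
    """Hypothetical sizing rule: how many GPU nodes are needed to drain
    the current queue within target_drain_min minutes.

    All names and defaults here are illustrative assumptions.
    """
    if jobs_per_node_per_min <= 0 or target_drain_min <= 0:
        raise ValueError("throughput and drain target must be positive")
    # Jobs one node can finish inside the drain window.
    capacity_per_node = jobs_per_node_per_min * target_drain_min
    needed = math.ceil(queued_jobs / capacity_per_node)
    # Clamp to the allowed fleet size so the pool never scales to zero
    # or beyond the configured ceiling.
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    # 40 queued jobs, 4 jobs/min per node, drain within 2 minutes.
    print(desired_nodes(40, 4.0, 2.0))
```

A rule like this scales down as well as up: when the queue empties, the result falls back to `min_nodes`. It is only useful if new nodes become ready within roughly the drain window, which is why the 2-minute startup time in the description matters.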