Interview with NVIDIA Dynamo Architect Kyle Kranen
Author: NVIDIA Developer
Uploaded: 2025-03-18
Views: 8387
Description:
In this episode, Nader and Carter interview NVIDIA Dynamo architect Kyle Kranen to learn what Dynamo is and how it can increase the throughput of models like DeepSeek-R1 by up to 30x!
You have 3 levers when running inference on AI models: quality, cost, speed.
For example, reasoning models like DeepSeek-R1 use test-time scaling: asking the model to think improves quality but reduces speed and increases cost.
We dive into how NVIDIA Dynamo lets you adjust all 3 levers through techniques like disaggregation, KV offloading, and KV routing.
Read: https://developer.nvidia.com/blog/int...
Follow Kyle ➡️ / kyle-kranen
Follow Carter ➡️ / carter-abdallah-958666140
Follow Nader ➡️ / naderlikeladder