Scaling Generative AI Inference with llm-d - DevConf.IN 2026
Author: DevConf
Uploaded: 2026-02-18
Views: 14
Description:
Title: Scaling Generative AI Inference with llm-d
Speaker(s): Dasharath Masirkar
---
Generative AI models are rapidly changing the landscape of application development, but deploying and serving these large models in production at scale presents significant challenges. llm-d is an open-source, Kubernetes-native distributed inference serving stack designed to address these complexities. This session will introduce developers to llm-d, demonstrating how it provides "well-lit paths" to serve large generative AI models with the fastest time-to-value and competitive performance across diverse hardware accelerators. Attendees will learn about llm-d's architecture, key features, and how to leverage its tested and benchmarked recipes for production deployments, focusing on practical applications and best practices.
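As a rough illustration of what "serving a generative AI model on Kubernetes" looks like from the application side (not taken from the talk or the llm-d documentation), the sketch below queries a self-hosted, OpenAI-compatible inference endpoint of the kind that vLLM-based serving stacks typically expose. The gateway URL and model name are placeholders, not a real llm-d deployment.

```python
# Minimal client-side sketch, assuming an OpenAI-compatible endpoint is
# reachable inside the cluster. Endpoint URL and model name are hypothetical.
from openai import OpenAI

client = OpenAI(
    base_url="http://inference-gateway.example.svc.cluster.local/v1",  # placeholder in-cluster gateway
    api_key="not-needed-for-local-serving",  # many self-hosted stacks ignore the key
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model identifier
    messages=[
        {"role": "user", "content": "Summarize what Kubernetes-native inference serving means."}
    ],
    max_tokens=128,
)
print(response.choices[0].message.content)
```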
---
Full schedule, including slides and other resources:
https://pretalx.devconf.info/devconf-...