KServe Next: Advancing Generative AI Model Serving - Yuan Tang, Red Hat & Dan Sun, Bloomberg
Author: CNCF [Cloud Native Computing Foundation]
Uploaded: 2025-11-24
Views: 324
Description:
As generative AI rapidly reshapes the AI landscape, the need for scalable, efficient, and interoperable model serving infrastructure has never been greater. In this session, we'll trace the journey from bespoke model deployment patterns to modern, Kubernetes-native serving platforms. We'll dive into the latest challenges in deploying and scaling large language models (LLMs), including inference performance, KV-cache management, distributed execution, and cost optimization.
We are thrilled to announce the release of KServe v0.17, a major milestone introducing enhanced support for generative AI workloads: a dedicated LLMInferenceService CRD tailored for LLM-serving capabilities (e.g., disaggregated serving), model and KV caching, and integration with the open source Envoy AI Gateway.
Attendees will gain insights into the technologies powering the next wave of AI applications and learn how to prepare their infrastructure for a generative AI future.