Agentic Workload Inference at Scale: ByteDance’s AIBrix & DeerFlow | Ray Summit 2025
Author: Anyscale
Uploaded: 2025-11-18
Views: 131
Description:
At Ray Summit 2025, Henry Li and Liguang Xie from ByteDance share how they are shaping the next generation of LLM inference infrastructure with AIBrix—an open-source, Kubernetes- and Ray-powered control plane designed specifically for large-scale, production-grade language model workloads.
They begin by outlining the growing infrastructure challenges that come with deploying LLMs and agentic systems in real-world environments—where performance, scalability, and cost efficiency must all be optimized simultaneously. AIBrix addresses these challenges with a suite of LLM-focused capabilities co-developed with the vLLM community, including:
LLM-specific autoscaling for workload-aware, resource-efficient scaling
Smart KVCache management using multi-level caching, offloading, and prefix-aware reuse to reduce memory pressure and improve latency
Load- and cache-aware routing, enabling adaptive traffic distribution that remains fair and low-latency under real-world load patterns
The speakers also highlight recent innovations such as dynamic LoRA orchestration and support for heterogeneous hardware to maximize cost effectiveness across diverse clusters.
Finally, Henry and Liguang demonstrate how AIBrix powers advanced agentic workloads through DeerFlow, an open-source deep research framework. They showcase real-world use cases—including building a personal research assistant on open-source LLMs—and illustrate how AIBrix enables reliable, scalable, low-latency execution for next-generation AI agents.
Attendees will gain a deep understanding of AIBrix’s architecture, its performance breakthroughs, and its role in defining the future of enterprise-grade LLM infrastructure.
Subscribe to our YouTube channel to stay up-to-date on the future of AI! / anyscale
🔗 Connect with us:
LinkedIn: / joinanyscale
X: https://x.com/anyscalecompute
Website: https://www.anyscale.com/