Secure & Scalable AI on Ray + Kubernetes: Google’s Decoupled Agent Pattern | Ray Summit 2025

Author: Anyscale

Uploaded: 2025-11-20

Views: 662

Description: At Ray Summit 2025, Alex Bulankou and Brandon Royal from Google share how to bring agentic AI systems out of the lab and into production through the Decoupled Agent Pattern, a scalable, resilient, and secure architecture built on Ray and Kubernetes.

They begin by outlining the core production challenge of agentic systems: integrating LLMs, tools, and long-lived stateful agents while ensuring security, elasticity, and high-throughput execution. Traditional architectures struggle to balance these constraints. The Decoupled Agent Pattern solves this by cleanly separating the stateful agent logic from the stateless, scalable tools it invokes.

At the heart of this pattern:

The agent’s core logic runs as a durable Ray Actor, with lifecycle and placement managed by Ray’s Global Control Store (GCS) for high availability.

Tools are executed as thousands of stateless Ray Tasks, enabling massive parallelism and elasticity.

Untrusted or dynamically generated code runs in gVisor sandboxes, providing kernel-level isolation without compromising throughput—made possible through Kubernetes’ secure runtime capabilities.
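
The split between one stateful agent and many stateless tools can be illustrated with a minimal sketch. It is not the speakers' code and does not use Ray: a thread pool from Python's standard library stands in for Ray Tasks, and a plain object stands in for the Ray Actor (in Ray, both would be declared with the `@ray.remote` decorator). The names `Agent`, `run_tools`, and `price_to_earnings`, and the input numbers, are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


def price_to_earnings(price: float, earnings_per_share: float) -> float:
    """Stateless tool (hypothetical): the result depends only on its arguments."""
    return price / earnings_per_share


@dataclass
class Agent:
    """Stateful agent (hypothetical): it alone owns the history of tool calls."""
    history: list = field(default_factory=list)

    def run_tools(self, pool, tool, batches):
        # Fan out one independent tool call per input tuple.
        futures = [pool.submit(tool, *args) for args in batches]
        # Gather results in submission order.
        results = [f.result() for f in futures]
        # Only the agent mutates state; the tools never see it.
        self.history.append((tool.__name__, results))
        return results


# Made-up (price, earnings-per-share) pairs for three imaginary companies.
inputs = [(100.0, 5.0), (90.0, 3.0), (50.0, 2.0)]

agent = Agent()
with ThreadPoolExecutor(max_workers=4) as pool:
    ratios = agent.run_tools(pool, price_to_earnings, inputs)

print(ratios)
```

Because the tool keeps no state, the pool can grow or shrink, or a failed call can be retried, without touching the agent's history. That property is what allows tool execution to scale independently of the agent.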

Alex and Brandon demonstrate the architecture with a series of live scenarios, including a financial analysis agent running on a Ray cluster on Google Kubernetes Engine (GKE).

They then show how the architecture leverages deep Kubernetes-native integrations:

KubeRay’s topology-aware placement allows Ray to understand node-level characteristics, enabling optimal scheduling.

This unlocks intelligent capacity management with tools like Kueue for cost-efficient batch scheduling.

And it provides a clear pathway to mission-critical resilience, supporting zero-downtime upgrades and fault-tolerant agent execution.

Attendees will leave with a practical blueprint for deploying agentic AI systems in production—combining Ray’s distributed computing strengths with Kubernetes’ security and orchestration capabilities to build scalable, resilient, and secure agentic runtimes.

🔗 Connect with us:
LinkedIn:   / joinanyscale
X: https://x.com/anyscalecompute
Website: https://www.anyscale.com/
