Reliability engineering for enterprise AI agents: evaluations, simulations and episodic memory

Author: UiPath

Uploaded: 2026-02-17

Views: 1098

Description: Shipping AI agents in the enterprise requires more than a convincing demo. https://uipath.ly/46f0drb

Featuring UiPath AI Leaders:
Taqi Jaffri, VP, Product Management
Scott Florentino, Director, Software Engineering

In this episode, we break down the core pillars of reliability engineering for enterprise AI agents: evaluations, simulations, and episodic memory. Taqi and Scott walk through the practical systems that move agents from prototype to production.

You’ll learn:
What agent evaluations actually measure (context, precision, faithfulness)
How deterministic evaluators compare to LLM-as-a-judge approaches
Why trajectory evaluation matters beyond simple input/output testing
How to simulate tool calls, escalations and failure conditions
What episodic memory is — and how it differs from fine-tuning
How agents improve over time using structured human feedback
How to measure reliability before release
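
To make the difference between black-box and trajectory evaluation concrete, here is a minimal sketch in Python. It is not taken from the episode or from any UiPath API: the names AgentRun, ToolCall, exact_match and trajectory_match, and the refund scenario, are hypothetical and chosen only for illustration. It assumes a recorded run consists of a final output plus an ordered list of tool calls, each marked as succeeded or failed.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """One recorded (or simulated) tool call; hypothetical structure."""
    name: str
    ok: bool = True  # False simulates a failed call


@dataclass
class AgentRun:
    """Final output plus the ordered tool calls that produced it."""
    output: str
    trajectory: list[ToolCall] = field(default_factory=list)


def exact_match(run: AgentRun, expected_output: str) -> bool:
    """Deterministic black-box check: looks only at the final output."""
    return run.output.strip().lower() == expected_output.strip().lower()


def trajectory_match(run: AgentRun, expected_tools: list[str]) -> bool:
    """Trajectory check: the expected tools must appear, in order, among
    the calls that succeeded. Extra successful calls in between are allowed."""
    remaining = iter(call.name for call in run.trajectory if call.ok)
    # `tool in remaining` advances the iterator, so order is enforced.
    return all(tool in remaining for tool in expected_tools)


# Simulated failure: the agent reports success although the refund call failed.
run = AgentRun(
    output="Refund approved",
    trajectory=[ToolCall("lookup_order"), ToolCall("issue_refund", ok=False)],
)
print(exact_match(run, "refund approved"))                       # True
print(trajectory_match(run, ["lookup_order", "issue_refund"]))   # False
```

In this sketch the output-only check passes while the trajectory check fails, which is the kind of gap input/output testing alone would miss. An LLM-as-a-judge evaluator would replace exact_match with a model call that grades the output against a rubric; that part is left out here because it depends on the model provider.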

If you're building AI agents that need to be auditable, measurable and production-ready, this episode focuses on the engineering discipline behind enterprise-grade deployment.

(This video was recorded in November 2025)

⏱ Timestamps

00:00 Why reliability engineering matters for enterprise AI agents
00:24 What are agent evaluations?
01:14 Context, precision and faithfulness explained
02:07 Evaluation sets and LLM-as-a-judge
03:17 Generating strong evaluation data
04:24 Trajectory evaluation vs black-box testing
05:30 Simulating tool calls and failure scenarios
07:08 What is episodic memory?
08:15 Episodic memory vs fine-tuning
09:14 Learning from human feedback
10:15 Managing and inspecting agent memory
11:30 Defense-in-depth for enterprise AI reliability
12:15 Proving performance with measurable reliability scores

🚀 Join our community for more updates:
Academy: https://uipath.ly/462jPft
Blog: https://uipath.ly/3EwSRRC
LinkedIn: https://uipath.ly/44SelTOa
Facebook: https://uipath.ly/45A4naF
Forum: https://uipath.ly/3ReA7O3
