PM Cookbook: Foundations of a Repeatable Evals Process
Author: Maxim AI
Uploaded: 2026-02-15
Views: 53
Description:
In this cookbook, we’ll help you lay the foundation for measuring and improving the quality of your AI agents while defining a repeatable, collaborative evaluation process across the AI development lifecycle.
Using a healthcare scribing agent as an example, we demonstrate how to compare prompt versions, run automated LLM evaluations on test datasets, trace agent workflows, and monitor production logs.
With these processes, product teams can measure AI quality with clear metrics, identify failure modes early, track latency and cost, and continuously evaluate production performance. If you're building LLM-powered features or agent workflows, this cookbook helps you ship reliable AI systems with confidence and control.
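As a rough illustration of the "automated LLM evaluations on test datasets" step described above, here is a minimal Python sketch of an offline evaluation run: each test case is passed through the agent, scored by an automated evaluator, and aggregated into a pass rate. The `scribe_agent` stub, the `keyword_evaluator`, and the dataset shape below are hypothetical stand-ins, not Maxim AI's actual API; a real setup would call the deployed agent and typically use an LLM-as-judge evaluator in place of the keyword check.

```python
def scribe_agent(transcript: str) -> str:
    """Hypothetical scribing agent: returns a one-line clinical note."""
    return f"Patient note: {transcript.lower()}"

def keyword_evaluator(output: str, required_terms: list[str]) -> float:
    """Stand-in for an LLM judge: fraction of required terms present."""
    hits = sum(term.lower() in output.lower() for term in required_terms)
    return hits / len(required_terms)

def run_eval(dataset: list[dict], threshold: float = 0.5):
    """Score every test case and aggregate an overall pass rate."""
    results = []
    for case in dataset:
        output = scribe_agent(case["input"])
        score = keyword_evaluator(output, case["required_terms"])
        results.append({"input": case["input"],
                        "score": score,
                        "passed": score >= threshold})
    pass_rate = sum(r["passed"] for r in results) / len(results)
    return results, pass_rate

# Hypothetical test dataset for a healthcare scribing agent.
dataset = [
    {"input": "Patient reports headache and nausea",
     "required_terms": ["headache", "nausea"]},
    {"input": "Follow-up visit for hypertension",
     "required_terms": ["hypertension", "follow-up"]},
]

results, pass_rate = run_eval(dataset)
print(f"pass rate: {pass_rate:.0%}")
```

Swapping the keyword check for an LLM judge keeps the loop identical; only the evaluator changes, which is what makes the process repeatable across prompt versions.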
00:00 - Intro
01:58 - Prompt Engineering
07:42 - Offline Evals
13:30 - Analyzing an Evaluation Run Report
18:37 - Observability
20:50 - Online Evals