Scale AI Agent Evaluation with NVIDIA NeMo Evaluator LLM-as-a-Judge
Author: NVIDIA Developer
Uploaded: 2025-08-20
Views: 1248
Description:
In this step-by-step tutorial, you’ll discover how to scale your AI agent evaluation workflows with NVIDIA NeMo Evaluator LLM-as-a-Judge.
This video walks you through how to:
✅ Install NVIDIA NIM Operator and set up Prometheus
✅ Set up LLM-as-a-Judge with NeMo Evaluator using Docker Compose
✅ Configure the evaluation with the Llama Nemotron Nano 1.1 4B model
✅ Scale the evaluation to multiple GPUs
The NVIDIA NeMo Evaluator microservice simplifies end-to-end evaluation of generative AI applications through an easy-to-use API, covering LLM evaluation, retrieval-augmented generation (RAG) evaluation, and AI agent evaluation. It provides LLM-as-a-judge capabilities, along with a comprehensive suite of LLM benchmarks and metrics for a wide range of custom tasks and domains, including reasoning, coding, and instruction-following.
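For readers new to the LLM-as-a-judge pattern, here is a minimal Python sketch of the general idea: a judge model receives a question and a candidate response, and its reply is parsed into a numeric score. This is not the NeMo Evaluator API. The prompt template, the 1 to 5 scale, and the names `parse_score`, `judge`, and `stub_judge` are all assumptions made for illustration, and `stub_judge` is a hypothetical stand-in for a call to a real judge model endpoint.

```python
import re
from typing import Callable, Optional

# Hypothetical judge prompt; the wording and the 1-5 scale are assumptions.
JUDGE_TEMPLATE = (
    "You are an impartial judge. Rate the response to the question "
    "on a scale of 1 to 5.\n"
    "Question: {question}\n"
    "Response: {response}\n"
    "Reply in the form 'SCORE: <number>'."
)


def parse_score(judge_output: str) -> Optional[int]:
    """Extract a 1-5 score from the judge's reply, or None if absent."""
    match = re.search(r"SCORE:\s*([1-5])\b", judge_output)
    return int(match.group(1)) if match else None


def judge(question: str, response: str,
          call_judge: Callable[[str], str]) -> Optional[int]:
    """Build the judge prompt, call the judge model, and parse its score."""
    prompt = JUDGE_TEMPLATE.format(question=question, response=response)
    return parse_score(call_judge(prompt))


def stub_judge(prompt: str) -> str:
    """Hypothetical stand-in for a request to a deployed judge model."""
    return "SCORE: 4"


samples = [
    ("What is 2 + 2?", "4"),
    ("Name a primary color.", "Blue"),
]
scores = [judge(q, r, stub_judge) for q, r in samples]

# Average only the replies that contained a parseable score.
valid = [s for s in scores if s is not None]
mean_score = sum(valid) / len(valid) if valid else None
print(scores, mean_score)
```

In a real setup, `stub_judge` would be replaced by a request to the deployed judge model, and replies without a parseable score would typically be logged rather than silently dropped.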
Get started:
✅ Follow the official documentation to scale using NIM Operator: https://docs.nvidia.com/nim-operator/...
✅ Download included Jupyter notebooks to replicate the evaluation workflows in your own environment: https://github.com/NVIDIA/GenerativeA...
✅ Learn more about NeMo Evaluator: https://nvda.ws/414aHXJ
00:00 - Introduction
01:00 - Building a Manifest for a NIM
01:52 - Run Command Get Pods
03:00 - Standard NeMo Evaluator Notebook
04:03 - Creating a Job
04:22 - Changing Configurations
05:00 - Apply YAML File to NIM Operator Instance