Fine-Tuning Language Models with Reinforcement Learning with Michael Albada
Author: O'Reilly
Uploaded: 2026-01-23
Views: 214
Description:
Watch the entire Superstream: https://learning.oreilly.com/videos/a...
Building reliable AI systems means going beyond prompt engineering. In this AI Superstream session, Microsoft's Michael Albada explores how fine-tuning language models with reinforcement learning can deliver greater accuracy, control, and cost efficiency. You'll see how open-weight models are closing the gap with proprietary options, and how newer techniques like low-rank adaptation (LoRA) make fine-tuning more practical than ever.
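To illustrate why LoRA makes fine-tuning more practical, a minimal sketch of the parameter-count argument (the function names and dimensions below are illustrative, not from the session): instead of updating a full d × k weight matrix, LoRA trains a rank-r update B @ A, reducing trainable parameters from d·k to r·(d + k).

```python
# Illustrative sketch: LoRA's trainable-parameter savings.
# Full fine-tuning updates the entire d x k weight matrix W;
# LoRA learns a low-rank update B @ A (B: d x r, A: r x k) with r << min(d, k).

def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when updating the full weight matrix."""
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA adapter pair."""
    return r * (d + k)

# Hypothetical example: one 4096 x 4096 projection, LoRA rank 8.
d = k = 4096
full = full_finetune_params(d, k)   # 16,777,216 parameters
lora = lora_params(d, k, r=8)       # 65,536 parameters
print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")  # 256x fewer
```

At rank 8 the adapter trains roughly 0.4% of the parameters of that single matrix, which is what makes fine-tuning feasible on modest hardware.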
Michael also breaks down when fine-tuning is the right choice compared to RAG or off-the-shelf APIs. Using the Glaive Function Calling dataset, he demonstrates how reinforcement learning with verifiable rewards can shape model behavior, improve structured outputs, and support real-world use cases that demand reliability and domain-specific performance.
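The idea of a "verifiable reward" for function calling can be sketched as follows. This is a hedged illustration, not code from the session: the tool schema, function names, and reward rules are all hypothetical. The key property is that the reward is computed programmatically, so it can be checked, rather than estimated by a learned reward model.

```python
# Hypothetical verifiable-reward function for function-calling outputs.
# Reward 1.0 only if the model emits valid JSON naming an allowed tool
# with all required arguments present; 0.0 otherwise.
import json

# Hypothetical tool schema: tool name -> required argument names.
ALLOWED_TOOLS = {"get_weather": {"location"}}

def verifiable_reward(model_output: str) -> float:
    try:
        call = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0  # not parseable JSON
    name = call.get("name")
    if name not in ALLOWED_TOOLS:
        return 0.0  # unknown tool
    args = call.get("arguments", {})
    if not isinstance(args, dict) or not ALLOWED_TOOLS[name] <= set(args):
        return 0.0  # missing required arguments
    return 1.0

good = '{"name": "get_weather", "arguments": {"location": "Seattle"}}'
bad = "Sure! I will check the weather for you."
print(verifiable_reward(good), verifiable_reward(bad))  # 1.0 0.0
```

During RL fine-tuning, a signal like this scores each sampled completion, steering the model toward reliably structured outputs without human preference labels.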
Follow O'Reilly on:
LinkedIn: / oreilly
Facebook: / oreilly
Instagram: / oreillymedia
BlueSky: https://bsky.app/profile/oreilly.bsky...