Don’t lie to your friends: Learning what you know from collaborative self-play

Author: Conference on Language Modeling

Uploaded: 2025-11-03

Views: 103

Description: Authors: Jacob Eisenstein, Reza Aghajani, Adam Fisch, Dheeru Dua, Fantine Huot, Mirella Lapata, Vicky Zayats, Jonathan Berant

To be helpful assistants, AI agents must be aware of their own capabilities and limitations. This includes knowing when to answer from parametric knowledge versus using tools, when to trust tool outputs, and when to abstain or hedge. Such capabilities are hard to teach through supervised fine-tuning because they require constructing examples that reflect the agent's specific capabilities. We therefore propose a radically new approach to teaching agents what they know: collaborative self-play. We construct multi-agent collaborations in which the group is rewarded for collectively arriving at correct answers. The desired meta-knowledge emerges from the incentives built into the structure of the interaction. We focus on small societies of agents that have access to heterogeneous tools (corpus-specific retrieval), and therefore must collaborate to maximize their success with minimal effort. Experiments show that group-level rewards for multi-agent communities can induce policies that transfer to improve tool use and selective prediction in single-agent scenarios.


