
Traits of next generation reasoning models

Author: Nathan Lambert

Uploaded: 2025-06-05

Views: 6451

Description: Current AI models are extremely skilled, as shown by the step change in evaluation scores across the industry in the first half of 2025, but they often fail when presented with even medium time-horizon tasks. This talk presents a taxonomy of four traits of reasoning models (skills, calibration, strategy, and abstraction) that will be crucial to creating the next generation of AI applications. It focuses on the latter two, strategy and abstraction, and discusses how these traits will enable reliable, long-horizon agents. The talk concludes with a scenario in which these agentic behaviors are the foundation for RL continuing to scale in the coming years, with post-training techniques reaching compute parity with pretraining methods sooner rather than later.

00:00 Everybody has reasoning models
03:18 Autonomy and the need for planning
04:53 Traits of reasoners
15:23 RL as the focus of LM development

Slides: https://docs.google.com/presentation/...
Written version: https://www.interconnects.ai/p/next-g...

Related videos

Early stages of the reinforcement learning era of language models

How language model post-training is done today

A Taxonomy for Next-gen Reasoning - Nathan Lambert, Allen Institute (AI2) & Interconnects.ai

Controlling LLM behavior without fine-tuning

Andrej Karpathy: Software Is Changing (Again)

GRPO's new variants and implementation secrets

Yann LeCun | Self-Supervised Learning, JEPA, World Models, and the future of AI

How to approach post-training for AI applications

Andrew Ng: State of AI Agents | LangChain Interrupt

Experimenting with Reinforcement Learning with Verifiable Rewards (RLVR)

Open-source AI (and LLMs): Definitions, Finding Nuance, and Policy

Advanced Context Engineering for Agents

Yann LeCun, "Mathematical Obstacles on the Way to Human-Level AI"

The art of training a good (reasoning) language model

The Misconception that Almost Stopped AI [How Models Learn Part 1]

Reinforcement Learning for Agents - Will Brown, Machine Learning Researcher at Morgan Stanley

Visualizing transformers and attention | Talk for TNG Big Tech Day '24

Recapping Open Models in 2025

The most complex model we actually understand

An Unexpected Reinforcement Learning Renaissance
