ycliper


Training Dynamics of Parametric and In-Context Knowledge Utilization in Language Models

Author: Mayuresh Shilotri

Uploaded: 2026-01-29

Views: 4

Description: Paper: https://arxiv.org/abs/2510.02370

Title: Training Dynamics of Parametric and In-Context Knowledge Utilization in Language Models

Authors: Minsung Kim, Dong-Kyum Kim, Jea Kwon, Nakyeong Yang, Kyomin Jung, Meeyoung Cha

Abstract: Large language models often encounter conflicts between in-context knowledge retrieved at inference time and parametric knowledge acquired during pretraining. Models that accept external knowledge uncritically are vulnerable to misinformation, whereas models that adhere rigidly to parametric knowledge fail to benefit from retrieval. Despite the widespread adoption of retrieval-augmented generation, we still lack a systematic understanding of what shapes knowledge-arbitration strategies during training. This gap risks producing pretrained models with undesirable arbitration behaviors and, consequently, wasting substantial computational resources after the pretraining budget has already been spent. To address this problem, we present the first controlled study of how training conditions influence models' use of in-context and parametric knowledge, and how they arbitrate between them. We train transformer-based language models on a synthetic biographies corpus while systematically controlling various conditions. Our experiments reveal that intra-document repetition of facts fosters the development of both parametric and in-context capabilities. Moreover, training on a corpus that contains inconsistent information or distributional skew encourages models to develop robust strategies for leveraging parametric and in-context knowledge. Rather than viewing these non-ideal properties as artifacts to remove, our results indicate that they are important for learning robust arbitration. These insights offer concrete, empirical guidance for pretraining models that harmoniously integrate parametric and in-context knowledge.
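The abstract describes training on a synthetic biographies corpus while controlling conditions such as intra-document repetition of facts and corpus inconsistency. As a rough illustration of that kind of setup, here is a minimal sketch of a configurable biography-corpus generator; the attribute names, templates, and parameters are my own assumptions for illustration, not the authors' actual code or data.

```python
import random

# Hypothetical sketch of a synthetic-biographies corpus generator in the
# spirit of the paper's setup. Attribute pools, templates, and parameter
# names are illustrative assumptions, not taken from the paper.

ATTRIBUTES = {
    "birth_city": ["Paris", "Tokyo", "Lagos", "Lima"],
    "employer": ["Acme Corp", "Globex", "Initech"],
}

TEMPLATES = {
    "birth_city": "{name} was born in {value}.",
    "employer": "{name} works for {value}.",
}

def make_biography(name, facts, repetition=2, inconsistency_rate=0.0,
                   rng=random):
    """Render one biography document.

    repetition         -- how many times each fact is restated in the
                          document (intra-document repetition).
    inconsistency_rate -- probability that a restatement swaps in a
                          conflicting value (corpus inconsistency).
    """
    sentences = []
    for attr, value in facts.items():
        for _ in range(repetition):
            v = value
            if rng.random() < inconsistency_rate:
                # Inject a conflicting value for this restatement.
                alternatives = [x for x in ATTRIBUTES[attr] if x != value]
                v = rng.choice(alternatives)
            sentences.append(TEMPLATES[attr].format(name=name, value=v))
    rng.shuffle(sentences)
    return " ".join(sentences)

def make_corpus(n_people, repetition=2, inconsistency_rate=0.0, seed=0):
    """Build (name, ground-truth facts, document) triples."""
    rng = random.Random(seed)
    corpus = []
    for i in range(n_people):
        name = f"Person-{i}"
        facts = {attr: rng.choice(vals) for attr, vals in ATTRIBUTES.items()}
        doc = make_biography(name, facts, repetition, inconsistency_rate, rng)
        corpus.append((name, facts, doc))
    return corpus
```

Sweeping `repetition` and `inconsistency_rate` (and skewing how often each person appears) would give the kind of controlled corpus variations the abstract attributes to the study; the actual experimental design is in the paper itself.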

Tags: Machine Learning, Natural Language Processing, Research, gan, transformer, unsupervised, supervised, few-shot, gru, attention, search, training, dynamics, parametric, context, knowledge, research paper, academic, study, analysis, tutorial, explained, breakdown, paper review, research summary, AI research, scientific paper, methodology, results, findings, innovation, technology, computing, algorithm, model, dataset, evaluation, performance, accuracy, efficiency

Welcome to Mayuresh Shilotri's YouTube channel, maintained by Mayuresh Shilotri.



You can follow me at:

Blog - https://shilotri.com/

LinkedIn - /mayureshshilotri

Twitter - /mshilotri



Note: I claim only to have read the research paper and to have created this video using an AI tool. I am not the author; all intellectual heavy lifting was done by the respective authors. 🙏



© 2025 ycliper. All rights reserved.


Contact for rights holders: [email protected]