EMPO2: Exploratory Memory-Augmented LLM Agents via Hybrid RL Optimization
Author: Research Paper Review
Uploaded: 2026-02-28
Views: 20
Description:
We propose a new reinforcement learning framework called EMPO² to improve the exploration ability of large language model (LLM) agents. Existing agents rely only on prior knowledge, which limits them in unfamiliar environments. This method instead combines non-parametric external memory with parameter updates so that the agent learns autonomously from its past failures. Agents use self-generated *reflective tips* to reduce trial and error, and these tips are systematically internalized into the model through an off-policy knowledge distillation process. In experiments on complex benchmarks such as ScienceWorld and WebShop, the method is reported to deliver more than double the performance of traditional algorithms, along with strong adaptability. These results suggest that agents can keep improving over the long term through self-directed exploration, without external help.
https://arxiv.org/pdf/2602.23008
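
The two mechanisms named in the description, an external memory of reflective tips and off-policy distillation of those tips into the model, can be illustrated with a small Python sketch. This is only a sketch under stated assumptions, not the paper's implementation: the names `TipMemory`, `build_prompt`, and `distillation_pair` are hypothetical, the prompt layout is invented for illustration, and no model or training step is included. It shows one plausible reading of "internalizing" tips: an action sampled with tips in the prompt is paired with a prompt that omits them, so a later update could teach the model to act well without the memory.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TipMemory:
    """Hypothetical non-parametric store of self-generated tips, keyed by task."""
    max_tips: int = 3
    _tips: dict = field(default_factory=lambda: defaultdict(list))

    def add(self, task: str, tip: str) -> None:
        # Keep only the most recent `max_tips` tips for each task.
        self._tips[task].append(tip)
        self._tips[task] = self._tips[task][-self.max_tips:]

    def retrieve(self, task: str) -> list:
        # Return a copy so callers cannot mutate the stored tips.
        return list(self._tips.get(task, []))


def build_prompt(task: str, observation: str, tips=()) -> str:
    """Assemble a prompt; the tips section appears only when tips are given."""
    lines = [f"Task: {task}"]
    if tips:
        lines.append("Tips from earlier attempts:")
        lines.extend(f"- {t}" for t in tips)
    lines.append(f"Observation: {observation}")
    return "\n".join(lines)


def distillation_pair(task: str, observation: str, action: str,
                      memory: TipMemory) -> dict:
    """Pair an action sampled WITH tips with a prompt WITHOUT tips (assumed setup)."""
    return {
        "behavior_prompt": build_prompt(task, observation, memory.retrieve(task)),
        "target_prompt": build_prompt(task, observation),
        "action": action,
    }
```

In this sketch the pair is off-policy in the sense that the action was sampled under `behavior_prompt` but would be used to train the model under `target_prompt`. How the actual method weights or corrects such samples is not stated in the description above.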