DeReCo: Decoupling Representation and Coordination Learning for Object-Adaptive Decentralized Multi~

Author: NAIST Robot Learning Lab

Uploaded: 2026-03-10

Views: 200

Description: DeReCo: Decoupling Representation and Coordination Learning for Object-Adaptive Decentralized Multi-Robot Cooperative Transport

Kazuki Shibata, Ryosuke Sota, Shandil Dhiresh Bosch, Yuki Kadokawa, Tsurumine Yoshihisa, and Takamitsu Matsubara

https://arxiv.org/abs/2603.08111

Generalizing decentralized multi-robot cooperative transport across objects with diverse shapes and physical properties remains a fundamental challenge. Under decentralized execution, two key challenges arise: object-dependent representation learning under partial observability and coordination learning in multi-agent reinforcement learning (MARL) under non-stationarity. A typical approach jointly optimizes object-dependent representations and coordinated policies in an end-to-end manner while randomizing object shapes and physical properties during training. However, this joint optimization tightly couples representation and coordination learning, introducing bidirectional interference: inaccurate representations under partial observability destabilize coordination learning, while non-stationarity in MARL further degrades representation learning, resulting in sample-inefficient training. To address this structural coupling, we propose DeReCo, a novel MARL framework that decouples representation and coordination learning for object-adaptive multi-robot cooperative transport, improving sample efficiency and generalization across objects and transport scenarios. DeReCo adopts a three-stage training strategy: (1) centralized coordination learning with privileged object information, (2) reconstruction of object-dependent representations from local observations, and (3) progressive removal of privileged information for decentralized execution. This decoupling mitigates interference between representation and coordination learning and enables stable and sample-efficient training. Experimental results show that DeReCo outperforms baselines in simulation on three training objects, generalizes to six unseen objects with varying masses and friction coefficients, and achieves superior performance on two unseen objects in real-robot experiments. The demonstration video is available at https://sites.google.com/view/multi-h....
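The three-stage strategy described above can be pictured as a training schedule. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `StageSchedule` and `policy_object_input`, the step budgets, and the linear blend between privileged and reconstructed object features are all hypothetical, since the description does not say how the privileged information is actually removed.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StageSchedule:
    """Hypothetical step budget for the three stages (values are illustrative)."""
    coordination_steps: int    # stage 1: policies train with privileged object info
    reconstruction_steps: int  # stage 2: representation is reconstructed from local observations
    removal_steps: int         # stage 3: privileged info is progressively removed

    def stage(self, step: int) -> int:
        """Return 1, 2, or 3 for a global training step."""
        if step < self.coordination_steps:
            return 1
        if step < self.coordination_steps + self.reconstruction_steps:
            return 2
        return 3

    def privileged_weight(self, step: int) -> float:
        """Weight on privileged info: 1.0 in stages 1 and 2, then a linear decay to 0.0.

        The linear decay is an assumption made for this sketch.
        """
        if self.stage(step) < 3:
            return 1.0
        if self.removal_steps <= 0:
            return 0.0
        elapsed = step - self.coordination_steps - self.reconstruction_steps
        return max(0.0, 1.0 - elapsed / self.removal_steps)


def policy_object_input(
    privileged: Sequence[float],
    reconstructed: Sequence[float],
    weight: float,
) -> List[float]:
    """Blend privileged and reconstructed object features element-wise.

    With weight 1.0 the policy sees only privileged features; with weight 0.0
    it sees only features reconstructed from local observations, which is the
    condition required for decentralized execution.
    """
    return [weight * p + (1.0 - weight) * r for p, r in zip(privileged, reconstructed)]


if __name__ == "__main__":
    schedule = StageSchedule(coordination_steps=100, reconstruction_steps=50, removal_steps=50)
    for step in (0, 100, 150, 175, 200):
        print(step, schedule.stage(step), schedule.privileged_weight(step))
```

In this sketch the coordination policy never has to cope with an untrained representation: it first learns against ground-truth object features, and the reconstructed features are only phased in once stage 2 has had time to fit them.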

