VGGT-Long: Extending 3D Foundation Models to kilometer-scale RGB sequences
Author: dengkaicq
Uploaded: 2025-10-06
Views: 417
Description:
Extending 3D foundation models to large-scale 3D scene reconstruction from RGB sequences remains challenging. In this work, we propose VGGT-Long, a simple yet effective system that pushes monocular 3D foundation models to kilometer-scale, unbounded outdoor environments. Our approach addresses the scalability bottlenecks of existing models by processing the sequence in chunks, aligning neighboring chunks over their overlapping frames, and applying a lightweight loop closure optimization. The method requires only RGB input and does not need camera intrinsic calibration. VGGT-Long achieves trajectory and reconstruction performance comparable to traditional methods. It not only runs successfully on long RGB sequences where 3D foundation models typically fail, but also produces accurate and consistent geometry under various conditions. Our results highlight the potential of foundation models for scalable monocular 3D scene modeling in real-world settings, especially long-sequence autonomous driving. The code is open-sourced on GitHub under the same repository name.
https://github.com/DengKaiCQ/VGGT-Long
VGGT-Long-720p-30fps-HQ-H264-mu.mp4
#AI #artificialintelligence #3D #ComputerScience #ComputerVision #SLAM #SfM #github #AutonomousDriving
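
The chunk-and-align idea in the description can be illustrated with a minimal sketch. This is not the code from the VGGT-Long repository: the function names `split_into_chunks` and `umeyama_sim3` are hypothetical, the alignment shown is the plain closed-form Umeyama similarity fit (an assumed stand-in for whatever alignment the authors use), and loop closure is left out entirely. It assumes that each chunk yields 3D points for its frames and that the points from the shared frames correspond one to one.

```python
import numpy as np


def split_into_chunks(num_frames, chunk_size, overlap):
    """Return (start, end) frame ranges; consecutive chunks share `overlap` frames."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        end = min(start + chunk_size, num_frames)
        chunks.append((start, end))
        if end == num_frames:
            break
        start += step
    return chunks


def umeyama_sim3(src, dst):
    """Closed-form least-squares fit of dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points, for example the same
    overlapping frames reconstructed in two neighboring chunks.
    Returns the scale s, the 3x3 rotation R, and the translation t.
    """
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        D[2, 2] = -1.0

    R = U @ D @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(S) @ D) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t
```

A scale term is included because monocular reconstructions have no absolute scale, so two chunks can differ by a similarity transform rather than a rigid one. In this sketch, chunk k+1 would be mapped into the frame of chunk k by fitting `umeyama_sim3` on the points from their shared frames and applying `s * points @ R.T + t` to the rest of chunk k+1.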