Waymo research - EMMA model (Oct 2024): self-driving car that thinks in words
Author: Kevin Lui
Uploaded: 2025-12-13
Views: 23
Description: The video introduces EMMA (End-to-End Multimodal Model for Autonomous Driving, https://arxiv.org/pdf/2410.23262), a system built on a foundation model such as Google's Gemini, with the large language model (LLM) as its central component. EMMA is designed as a generalist model: it takes raw camera video and textual commands as input and directly produces outputs for multiple driving tasks, including motion planning, 3D object detection, road graph estimation, and scene understanding. A key feature is chain-of-thought reasoning, which improves the model's performance and lets it explain its driving decisions in words. The research shows that co-training EMMA on multiple tasks can improve performance on the individual tasks. The authors also acknowledge limitations, such as the current lack of native LiDAR/radar input fusion and the computational cost of deploying large models in real-time autonomous systems.