Atrass #7: A multistream multimodal foundation model for real-time voice-based applications
Author: European Trustworthy AI Association
Uploaded: 2025-10-01
Views: 41
Description:
By Patrick Perez, Kyutai, France
Speech, a uniquely seamless way for humans to exchange information and emotion, should be a key means for us to communicate with and through machines. This is not yet the case. As a step toward this goal, we introduce a versatile speech-text decoder-only model that can serve a number of voice-based applications. In particular, it has allowed us to build Moshi, the first-ever full-duplex spoken-dialogue system (with low latency and no imposed speaker turns), as well as Hibiki, the first simultaneous voice-to-voice translation model with voice preservation able to run on a mobile phone. This multistream multimodal model can also be turned into a visual-speech model (VSM) via cross-attention with visual information, which allows Moshi to freely discuss an image while maintaining its natural conversational style and low latency. This talk will provide an illustrated tour of this research.
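For intuition only, here is a minimal PyTorch sketch of the two ideas the abstract names: multistream modeling, where a text stream and several audio-token streams are fused per timestep and decoded jointly by one causal decoder, and visual conditioning via cross-attention into image features. All class names, dimensions, and the sum-fusion choice are illustrative assumptions, not Kyutai's actual Moshi/Hibiki implementation.

```python
import torch
import torch.nn as nn

class MultistreamDecoderLayer(nn.Module):
    """One decoder layer: causal self-attention over the fused stream,
    plus optional cross-attention into visual features (the VSM idea)."""
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x, visual=None, attn_mask=None):
        h = self.norm1(x)
        x = x + self.self_attn(h, h, h, attn_mask=attn_mask, need_weights=False)[0]
        if visual is not None:  # attend to image features when present
            h = self.norm2(x)
            x = x + self.cross_attn(h, visual, visual, need_weights=False)[0]
        return x + self.ff(self.norm3(x))

class MultistreamSpeechTextModel(nn.Module):
    """Hypothetical multistream decoder: at each timestep the text-token
    embedding and the embeddings of every audio stream (e.g. user and
    system codec tokens) are summed into one vector, so the decoder
    models all streams in parallel instead of taking turns."""
    def __init__(self, text_vocab=4000, audio_vocab=2048,
                 n_audio_streams=2, d_model=512, n_layers=4):
        super().__init__()
        self.text_emb = nn.Embedding(text_vocab, d_model)
        self.audio_embs = nn.ModuleList(
            nn.Embedding(audio_vocab, d_model) for _ in range(n_audio_streams)
        )
        self.layers = nn.ModuleList(
            MultistreamDecoderLayer(d_model) for _ in range(n_layers)
        )
        self.text_head = nn.Linear(d_model, text_vocab)
        self.audio_heads = nn.ModuleList(
            nn.Linear(d_model, audio_vocab) for _ in range(n_audio_streams)
        )

    def forward(self, text_tokens, audio_tokens, visual=None):
        # text_tokens: (B, T); audio_tokens: (B, n_audio_streams, T)
        B, T = text_tokens.shape
        x = self.text_emb(text_tokens)
        for s, emb in enumerate(self.audio_embs):
            x = x + emb(audio_tokens[:, s])  # per-step fusion by summation
        mask = torch.triu(  # causal mask: True = position not attendable
            torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1
        )
        for layer in self.layers:
            x = layer(x, visual=visual, attn_mask=mask)
        return self.text_head(x), [head(x) for head in self.audio_heads]

# Toy usage: 10 frames, 2 audio streams, 7 "image" feature vectors.
model = MultistreamSpeechTextModel()
text = torch.randint(0, 4000, (1, 10))
audio = torch.randint(0, 2048, (1, 2, 10))
image_feats = torch.randn(1, 7, 512)
text_logits, audio_logits = model(text, audio, visual=image_feats)
print(text_logits.shape, audio_logits[0].shape)  # (1, 10, 4000) (1, 10, 2048)
```

The design point the sketch illustrates is that full-duplex behavior falls out of the representation: because both parties' audio streams advance at every frame, the model can listen and speak simultaneously rather than waiting for an explicit turn boundary.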