Charlie Gerard - Multi-Modal AI on the Web
Author: CascadiaJS
Uploaded: 2025-10-13
Views: 217
Description:
🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲
Tickets are ON SALE for CascadiaJS 2026 - https://cascadiajs.com/2026
🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲🌲
What if you could use multimodal LLMs to interact with websites or IoT devices using motion control? As advances in multimodal AI push the boundaries of what this technology can do, I started wondering how it could be leveraged for human-computer interaction. In this talk, I will take you through my research on building motion-controlled prototypes using LLMs in JavaScript.
00:00 Introduction & Speaker Background
02:01 The Value of Conferences & Making Connections
03:54 Motion Control with Multimodal AI: The Core Question
04:32 Inspiration: The Room E Project & Context-Aware Systems
06:38 What is Multimodal AI?
07:19 Project Goal: Controlling Devices with Hand Gestures & Gemini
07:48 Approaches to Motion Control with LLMs
09:31 Demo 1: Gesture Detection with Gemini
12:58 Demo 2: Function Calling – Toggling a Light with Gestures
16:11 Demo 3: Multi-Turn Function Calling for Multiple Lights
20:04 Demo 4: Combining Gemini with TensorFlow.js for Color Control
23:27 Demo 5: Custom Gestures with Vector Embeddings & Vector Databases
28:12 Research, Resources & Future Directions
29:43 Final Thoughts & Q&A