How Vik Built Moondream—A Tiny Vision Model with Big Power
Author: AI Tinkerers
Uploaded: 2025-03-10
Views: 1901
Description:
Vik from Moondream AI joins Joe to demo a vision-language model that runs locally—on your laptop, your phone, even a Raspberry Pi.
From visual question answering to gaze detection and UI automation, Vik shows how Moondream is redefining edge computer vision—no cloud required.
Whether you're into robotics, home automation, or lightweight AI, this “One-Shot” is packed with insights for builders.
Try it yourself at moondream.ai 🚀
00:00 - Intro to Moondream’s compression tech for 2B parameter models
00:22 - Joe welcomes Vik from Moondream
01:53 - Shift from traditional CV to promptable vision-language models
03:23 - Playground demo: Visual Question Answering (VQA)
04:42 - VQA demo results: speed, structure, and accuracy
05:03 - Object detection, pointing, and captioning demos
07:57 - Prompts that push reasoning: uniform detection
10:07 - Cross-task benefits: gaze detection improves directional reasoning
11:02 - Comparing Moondream’s VQA to Qwen’s visual reasoning model
13:21 - Why edge deployment still matters in vision
15:21 - 0.5B model runs on Raspberry Pi using 816MB with int4
16:15 - HAL 2000 setup: Moondream + TinyLlama + Coqui TTS
20:55 - Texas rancher uses drone and Moondream for cow detection
21:53 - Commercial use: air-gapped environments like retail, safety
23:09 - UI automation and button detection with pointing feature
29:41 - Vision for ambient agents and local inference
30:27 - Power efficiency: 10x less energy than 7B/20B cloud models
31:01 - Moondream API & Hugging Face transformers integration
36:11 - Vik’s background: From AWS to machine learning
40:56 - Discovering AI Tinkerers and global meetups
#AITinkerers #MoondreamAI #EdgeAI #ComputerVision #LLM #OpenSource #OneShot
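
A note on the 15:21 chapter: the 816MB figure for the 0.5B model with int4 can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not the Moondream team's own accounting. It assumes exactly 0.5 billion parameters, every weight stored at 4 bits, and MB meaning 10^6 bytes; none of these details are stated in the description, and the helper name is made up for illustration.

```python
# Back-of-the-envelope estimate of weight storage for a quantized model.
# Assumptions (not from the video description): exactly 0.5e9 parameters,
# all weights stored at the given bit width, 1 MB = 10**6 bytes.

def weight_megabytes(num_params: float, bits_per_param: int) -> float:
    """Return storage for the weights alone, in MB (hypothetical helper)."""
    return num_params * bits_per_param / 8 / 1e6

PARAMS = 0.5e9       # "0.5B model" from the chapter title
REPORTED_MB = 816    # total footprint quoted in the chapter title

int4_mb = weight_megabytes(PARAMS, 4)    # 4-bit weights
fp16_mb = weight_megabytes(PARAMS, 16)   # 16-bit weights, for comparison

print(f"int4 weights: {int4_mb:.0f} MB")
print(f"fp16 weights: {fp16_mb:.0f} MB")
print(f"fp16 weights alone exceed the quoted total: {fp16_mb > REPORTED_MB}")
```

Under these assumptions the 4-bit weights come to about 250 MB, while 16-bit weights alone would already exceed the quoted 816MB. The description does not break down what the rest of the footprint covers, so the gap between 250 MB and 816MB is left unexplained here.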