Ollama 0.19 MLX on Apple Silicon — 2x Faster, Fully Local
Author: TechWealth Hub
Uploaded: 2026-03-31
Views: 843
Description:
Ollama 0.19, rebuilt on Apple's MLX framework, delivers 58% faster prefill and 2x faster decode on Apple Silicon M5 chips.
In this video I break down:
What MLX means for local AI on Mac
The real benchmark numbers (1810 vs 1154 tokens/s prefill)
NVFP4 quantization for production parity
How it connects to Conductor OSS for a fully local coding stack
Source: ollama.com/blog/mlx
Ollama: ollama.com
Try it: ollama launch openclaw --model qwen3.5:35b-a3b-coding-nvfp4
Powered by Conductor OSS — github.com/charannyk06/conductor-oss
conductross.com
Works with Claude Code, Codex, Gemini CLI, and OpenCode — all fully local.
Timestamps:
00:00 Conductor OSS Intro
00:26 Ollama 0.19 MLX Hook
01:22 The Local AI Tradeoff
02:30 MLX Engine Deep Dive
03:48 Live Demo
04:57 Benchmark Results
06:24 Full Local Stack
08:02 Repo Showcase
08:54 Outro
Want us to build your AI product? - https://docs.google.com/forms/d/e/1FA...
Subscribe for daily AI news and analysis!
#Ollama #AppleSilicon #MLX #LocalAI #ConductorOSS #CodingAgents #TechWealthHub #AI