NVIDIA DGX Spark: From “Inference Box” to Dev Rig (What It Actually Is) | Ep 2
Author: Domesticating AI
Uploaded: 2026-02-13
Views: 2,694
Description:
Everyone keeps calling NVIDIA DGX Spark an “inference box”… but in practice, it behaves more like a dev rig.
In Ep 2 of Domesticating AI, we break down what Spark is actually good for (AI dev + fine-tuning) vs what it isn’t (a magical drop-in inference server), and why unified memory changes the whole experience for local AI.
What we cover
Training vs inference (and why “inference server” gets misused)
What unified memory changes for model loading + workflows
The “gateway stack”: Ollama + Open WebUI
When you outgrow turnkey UIs and need more control (sampling, behavior, workflows)
Why Spark shines for fine-tuning (QLoRA/Unsloth-style workflows)
Homelab reality: Docker “recipes,” troubleshooting, and why you need friends
Remote access done safely: Tailscale
Cloud vs home economics (when cloud is cheaper… and when it explodes)
Why accelerator workloads can be painful in “Kubernetes everything” land
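The "gateway stack" mentioned above (Ollama + Open WebUI) is typically run as two containers. A minimal Docker Compose sketch is below — image names and ports are the projects' documented defaults, but this is an illustrative config, not the exact setup from the episode; adjust volumes and GPU passthrough for your own machine:

```yaml
services:
  ollama:
    image: ollama/ollama            # model runner, serves an HTTP API on 11434
    volumes:
      - ollama:/root/.ollama        # persist downloaded model weights
    ports:
      - "11434:11434"
    # On an NVIDIA box, add GPU passthrough, e.g.:
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                 # UI reachable at http://localhost:3000
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama: {}
```

Bring it up with `docker compose up -d`, then pull a model (e.g. `docker exec -it <ollama-container> ollama pull llama3`) and chat through the UI on port 3000.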
Links & Resources
NVIDIA / DGX Spark
DGX Spark product page: https://www.nvidia.com/en-us/products...
Start building on Spark (recipes + docs hub): https://build.nvidia.com/spark
NIM on Spark (playbook): https://build.nvidia.com/spark/nim-llm
Local AI runners + UIs
Ollama: https://ollama.com/
Open WebUI (GitHub): https://github.com/open-webui/open-webui
Open WebUI docs: https://docs.openwebui.com/
llama.cpp: https://github.com/ggml-org/llama.cpp
LM Studio: https://lmstudio.ai/
vLLM: https://github.com/vllm-project/vllm
Jan: https://jan.ai/
Image / Workflow tools
ComfyUI: https://github.com/Comfy-Org/ComfyUI
AUTOMATIC1111 SD WebUI: https://github.com/AUTOMATIC1111/stab...
Unsloth: https://github.com/unslothai/unsloth
Networking / Remote access
Tailscale: https://tailscale.com/
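The "remote access done safely" point boils down to putting the box on a private tailnet instead of opening ports to the internet. A setup sketch (the install script URL is Tailscale's documented one; the port 3000 target assumes Open WebUI is mapped there, and `serve` flags vary slightly between Tailscale versions):

```shell
# Install Tailscale and join the machine to your tailnet
# (requires a Tailscale account; prints a login URL on first run).
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# Expose a local service (assumed: Open WebUI on port 3000) over HTTPS
# to devices on your tailnet only -- nothing is published publicly.
sudo tailscale serve --bg 3000
```

From any other device on the tailnet, the UI is then reachable at the machine's MagicDNS name — no port forwarding or dynamic DNS involved.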
Cloud GPU alternatives (mentioned)
Runpod pricing: https://www.runpod.io/pricing
Modal pricing: https://modal.com/pricing
Hosts
Miriah Peterson (Host): Miriah Peterson is a software engineer, Go educator, and community builder focused on production-first AI—treating LLM systems like real software with real users. She runs SoyPete Tech (streams + writing + open-source projects) and stays active in the Utah dev community through meetups and events, with a practical focus on shipping local and cloud AI systems.
Connect:
SoyPete Tech (YouTube): / @soypete_tech
SoyPete Tech (Substack): https://soypetetech.substack.com/
LinkedIn: / miriah-peterson-35649b5b
Matt Sharp (Host): Matt Sharp is an AI Engineer and Strategist for a tech consulting firm and co-author of LLMs in Production. He’s a recovering data scientist and MLOps expert with 10+ years of experience operationalizing ML systems in production. Matt also teaches a graduate-level MLOps-in-production course at Utah State University as an adjunct professor. You can find him on Substack (Data Pioneer), LinkedIn, and on his other podcast, the Learning Curve.
Connect:
Data Pioneer (Substack): https://thedatapioneer.substack.com/
Chris Brousseau (Host): Chris Brousseau is a linguist by training and an NLP practitioner by trade, with a career spanning linguistically informed NLP, modern LLM systems, and MLOps practices. He’s co-author of LLMs in Production and is currently VP of AI at VEOX. You can find him as IMJONEZZ (two Z’s) on YouTube, GitHub, and on LinkedIn.
Connect:
YouTube (IMJONEZZ): / @imjonezz
LinkedIn: / en
📘 LLMs in Production (Matt Sharp & Chris Brousseau): https://www.manning.com/books/llms-in...
Subscribe + Community
If you’re building local AI at home (or trying to), drop your setup in the comments:
GPU/CPU | RAM | Runner (Ollama/llama.cpp/vLLM) | Model + quant | Use case
And don’t forget to like, subscribe, and comment — it helps the show a ton.