My Preferred DIY Language Model Stack

Author: Cruz Macias

Uploaded: 2025-11-26

Views: 12

Description: Using Sora 2 to read the Recent Projects and Notable Accomplishments section of my CV verbatim, in the style of an AI YouTuber / grad student.

Is this a valid interviewing format?

OpenWebUI and Local LLM Deployment/Integration
Maintaining and expanding the OpenWebUI platform to integrate generative AI on the web, serving locally hosted small language models (SLMs) and cloud-based large language models (LLMs) through a unified architecture. See my preferred DIY Language Model-Stack(s) below, which include RAG/embedding models, web search tools, OCR, and reranking models. Key components include:

Models/Tools: Integrated diverse models (e.g., GLM, Mistral, Perplexity, and GGUF-format models) and tools (llama.cpp, Ollama, LMStudio, LangChain) with custom middleware.
Stack: Python (FastAPI, REST), cURL, WebSocket streaming, async/await, Docker, and Python venv/uv.
Features: Retrieval-augmented generation (RAG), embedding, reranking pipelines, Model Context Protocol (MCP) servers, multi-agent systems, and privacy-preserving protocols like Role-Pseudonymous Prompting (RPP).
Hardware Preferences:
  GPU: GGUF models hosted on llama.cpp servers (CUDA release) via HuggingFace repos or Ollama, optimized for an NVIDIA RTX 5070 Ti GPU (12GB VRAM).
  CPU: GGUF models hosted on llama.cpp servers (CPU release) via HuggingFace repos, optimized for an AMD Ryzen 9 8940HX.
Benchmarks: Matched commercial models on domain-specific benchmarks at significantly lower cost (up to a 94.4% reduction), supporting scalable, enterprise-grade performance.
Applications: Optimized for specialized contexts (e.g., research, analysis), ensuring secure, compliant, and cost-efficient generative AI architectures.

Preferred DIY Language Model-Stack(s):
Platform Integration
  OpenWebUI
  Continue
  Ollama
  llama.cpp
My Profiles
  HuggingFace
  Continue
  Ollama
  OpenWebUI
Cloud
  Instruct Model(s) Chat:
    GLM-4.5-Flash
    Magistral-Small
    Mistral-Nemo-12B
    Cohere-Command-R-7B
    Groq-GPT-OSS-20B
    Gemini-2.5-Flash
  Base Model(s) Code:
    Devstral-Small
  Web Search Model:
    Perplexity Search API
    Groq-Compound-Mini
  OCR Model:
    Mistral OCR
  Embedding Model:
    Cohere-Embed
    Codestral-Embed
    Mistral-Embed
  Reranking Model:
    Cohere-Rerank
Local
  Instruct Model(s) Chat:
    LiquidAI/LFM2-8B-A1B-GGUF
    google/gemma-7b-GGUF
    command-r7b:7b
    gemma3:4b
    deepseek-r1:8b
    qwen3:8b
    llama3.1:8b
    mistral:7b
    dolphin3:8b
    dolphin-llama3:8b
  Base Model(s) Code:
    google/codegemma-7b-GGUF
    LiquidAI/LFM2-1.2B-Tool-GGUF
    LiquidAI/LFM2-350M-Math-GGUF
    codegemma:7b
    deepcoder:1.5b
    deepseek-coder:6.7b
    llama3-groq-tool-use:8b
    qwen2.5-coder:7b
  RAG Local Model:
    LiquidAI/LFM2-1.2B-RAG-GGUF
  Embedding Model:
    leliuga/all-MiniLM-L12-v2-GGUF
    LiquidAI/LFM2-1.2B-Extract
    embeddinggemma:300m
    all-MiniLM-L6-v2:22m
    all-minilm:33m
  Reranking Model:
    gpustack/bge-reranker-v2-m3-GGUF
    bge-reranker-v2-m3:600m
    bge-m3:567m
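
The embedding, reranking, and RAG entries above usually cooperate in a retrieve-then-rerank flow: an embedding model shortlists candidate passages cheaply, a reranker reorders the shortlist more carefully, and the best passages are placed in the chat model's prompt. The sketch below only illustrates that control flow and is not the author's pipeline. toy_embed (bag-of-words counts) and toy_rerank_score (share of query terms present) are hypothetical stand-ins for real models such as all-MiniLM or bge-reranker-v2-m3, and every function name here is an assumption.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: bag-of-words term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def toy_rerank_score(query: str, doc: str) -> float:
    """Stand-in for a reranker: share of query terms that appear in the document."""
    q_terms = set(toy_embed(query))
    return len(q_terms & set(toy_embed(doc))) / len(q_terms) if q_terms else 0.0


def retrieve_then_rerank(query: str, docs: list[str], top_k: int = 3, top_n: int = 1) -> list[str]:
    """Stage 1: shortlist top_k by embedding similarity. Stage 2: keep top_n by rerank score."""
    q_vec = toy_embed(query)
    shortlist = sorted(docs, key=lambda d: cosine(q_vec, toy_embed(d)), reverse=True)[:top_k]
    return sorted(shortlist, key=lambda d: toy_rerank_score(query, d), reverse=True)[:top_n]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the retrieved passages and the question into one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only the context below.\n\nContext:\n{joined}\n\nQuestion: {query}"
```

In a real stack the two toy functions would be replaced by calls to the embedding and reranking models, while the two-stage structure stays the same: the cheaper embedding pass limits how many passages the slower reranker has to score.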

Content not intended for SEO-Spam/spamdexing/AI Slop or otherwise — solely to spread the Gospel of Jesus Christ through musicianship, art, technology, and media.

Prompts Powered by GPT-5.1

For a complete Statement of Copyright Protection and Limitation and Liability Statement, please visit,
https://cmathgit.github.io/cruzgmacia...
