What if we could control the State or Attention in an LLM?
Author: Xiaol.x
Uploaded: 2026-01-31
Views: 168
Description:
Beyond Prompting: State MoE & Residual, Rewiring LLM Brains on the Fly
What if you could download skills into an LLM like Neo in The Matrix?
Standard prompting is expensive: you pay for every prompt token, on every request. State Tuning offers a different path: optimize the model's initial state to inject a "persona," "skill," or "safety guard" directly into its short-term memory, with zero inference overhead.
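A minimal sketch of that idea in PyTorch, using a toy GRU backbone as a stand-in for RWKV/Mamba; the layer names, shapes, and training loop here are illustrative assumptions, not the video's actual code. The key move is that h_0 is the only trainable parameter while every backbone weight stays frozen:

```python
import torch
import torch.nn as nn

# Toy frozen backbone: any recurrent model with an initial state works here.
backbone = nn.GRU(input_size=64, hidden_size=128, batch_first=True)
head = nn.Linear(128, 1000)  # e.g. vocabulary logits
for p in list(backbone.parameters()) + list(head.parameters()):
    p.requires_grad = False  # the backbone is never retrained

# State Tuning: h_0 is the ONLY trainable parameter (the "skill download").
h0 = nn.Parameter(torch.zeros(1, 1, 128))  # (num_layers, batch, hidden)

optimizer = torch.optim.Adam([h0], lr=1e-2)
criterion = nn.CrossEntropyLoss()

def train_step(x, targets):
    # x: (batch, seq, 64) inputs, targets: (batch, seq) token ids
    batch = x.size(0)
    out, _ = backbone(x, h0.expand(-1, batch, -1).contiguous())
    loss = criterion(head(out).reshape(-1, 1000), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()  # gradients flow only into h0
    optimizer.step()
    return loss.item()
```

The tuned h_0 acts like a compressed prompt: it is optimized once offline, and at inference it replaces a long instruction prefix without adding a single extra token of processing.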
In this video, we visualize the architecture of State Residuals, exploring how we can adapt RNNs and Linear Transformers (like RWKV/Mamba) without retraining the backbone.
🚀 KEY CONCEPTS:
**State ControlNet**: Using a Zero-Bridge to steer the model safely (see the Zero-Bridge sketch after this list).
**State MoE**: Dynamically scaling the state's compression capacity with Routers (Decay, Storage, Retrieval); a router sketch follows this list.
**State Tuning**: Treating h_0 as a trainable parameter (The "Matrix" Download).
**State-FFN Interaction**: Solving the "Blind FFN" problem using State Gating, State-Driven MoEs, and Hypernetworks (gating sketch below).
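The Zero-Bridge borrows ControlNet's trick of zero-initialized connections. In the hedged sketch below, the frozen cell and the GRUCell control branch are stand-ins chosen for illustration: because the bridge's weights start at zero, the wrapped model is exactly the frozen baseline at initialization, and training can only gradually steer it away.

```python
import torch
import torch.nn as nn

class ZeroBridge(nn.Module):
    """Linear layer initialized to zero: at the start of training it
    contributes nothing, so steering begins from the frozen baseline."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        nn.init.zeros_(self.proj.weight)
        nn.init.zeros_(self.proj.bias)

    def forward(self, x):
        return self.proj(x)

class StateControlNet(nn.Module):
    """Wraps a frozen state-update cell with a trainable control branch."""
    def __init__(self, frozen_cell, dim):
        super().__init__()
        self.frozen_cell = frozen_cell  # e.g. an RWKV/Mamba-style state update
        for p in self.frozen_cell.parameters():
            p.requires_grad = False
        self.control = nn.GRUCell(dim, dim)  # stand-in for the trainable branch
        self.bridge = ZeroBridge(dim)

    def forward(self, x, state, control_state):
        new_state = self.frozen_cell(x, state)          # original dynamics
        control_state = self.control(x, control_state)  # learned steering signal
        return new_state + self.bridge(control_state), control_state
```

Because the bridge starts at zero, the wrapped model is bit-for-bit identical to the frozen one before training, which is what makes the steering "safe."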
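For State MoE, only the router roles (Decay, Storage, Retrieval) come from the video; the sketch below is one plausible reading under that assumption, in which a per-token router softly mixes three expert state operations:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StateMoE(nn.Module):
    """A router softly blends three expert state operations per token."""
    def __init__(self, dim):
        super().__init__()
        self.router = nn.Linear(dim, 3)        # logits: decay / storage / retrieval
        self.decay_gate = nn.Linear(dim, dim)  # how much of the state to keep
        self.write = nn.Linear(dim, dim)       # what to add to the state
        self.read = nn.Linear(dim, dim)        # how to read the state back out

    def forward(self, x, state):
        # x, state: (batch, dim)
        w = F.softmax(self.router(x), dim=-1)  # (batch, 3) mixture weights
        decayed = torch.sigmoid(self.decay_gate(x)) * state  # expert 1: forget
        stored = state + self.write(x)                       # expert 2: write
        new_state = w[:, 0:1] * decayed + w[:, 1:2] * stored + w[:, 2:3] * state
        out = x + w[:, 2:3] * self.read(new_state)           # expert 3: read
        return out, new_state
```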
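And for the "Blind FFN" problem, Method 1 (State Gating) might look like the following, assuming the baseline FFN normally sees only the token stream: a gate computed from the recurrent state modulates the FFN's hidden activations, so the FFN is no longer blind to what the state has stored. Methods 2 and 3 extend the same interface: the state selects FFN experts, or a hypernetwork maps the state to LoRA-style weight deltas.

```python
import torch
import torch.nn as nn

class StateGatedFFN(nn.Module):
    """Standard FFN whose hidden activations are gated by the recurrent state."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.up = nn.Linear(dim, hidden)
        self.down = nn.Linear(hidden, dim)
        self.state_gate = nn.Linear(dim, hidden)  # small trainable adapter

    def forward(self, x, state):
        h = torch.relu(self.up(x))                 # the usual "blind" FFN path
        g = torch.sigmoid(self.state_gate(state))  # gate derived from the state
        return self.down(h * g)                    # state now modulates the FFN
```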
TIMESTAMPS:
0:00 Intro: The Problem with Prompting
0:17 State ControlNet & Zero-Bridge
1:09 State Mixture of Experts (MoE)
2:03 PART 2: The "Matrix" Skill Download
2:20 Standard Prompting vs. State Tuning
3:08 The Mathematics (Optimizing Theta)
3:35 Applications: Safety Guards & Personas
4:05 PART 3: The FFN Bottleneck
4:37 Method 1: State Gating
5:05 Method 2: State-Driven MoE
5:30 Method 3: Hypernetworks / LoRA
TAGS:
#RWKV #LLM #MachineLearning #Manim #AI #StateTuning #Hypernetwork #LoRA #DeepLearning #NeuralNetworks #Mamba #RNN