Is SFT Dead? How Meta Uses Only 13 Parameters to Learn to Reason
Author: AI fun facts for all
Uploaded: 2026-02-15
Views: 80
Description:
What if I told you that a 7B model like Qwen2.5-7B could jump from 76% to 91% accuracy… using just 26 bytes of trainable parameters?
That’s smaller than a tweet.
In this video, we break down Meta's groundbreaking paper, “Learning to Reason in 13 Parameters.” We explore how TinyLoRA challenges everything we thought we knew about fine-tuning large language models.
We’ll cover:
Why just 13 trainable parameters can outperform traditional LoRA setups (see the first sketch after this list)
Why Reinforcement Learning (GRPO) crushes Supervised Fine-Tuning for reasoning (a GRPO sketch follows below)
The shocking “Inverse Scaling Law” that suggests bigger models may need less training
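The description doesn't spell out how 13 parameters can steer a 7B model, so here is a minimal, hedged sketch of one plausible construction: a tiny shared trainable vector drives rank-1 updates to frozen layers through fixed random projections. The names below (TinySteeredLinear, shared_theta) and the projection scheme are illustrative assumptions, not the paper's actual TinyLoRA code.

```python
import torch
import torch.nn as nn

class TinySteeredLinear(nn.Module):
    """Frozen linear layer plus a rank-1 update whose gain comes from a tiny
    shared parameter vector via a fixed random projection (illustrative
    assumption, not the paper's exact TinyLoRA construction)."""

    def __init__(self, base: nn.Linear, shared_theta: nn.Parameter, seed: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # the backbone stays frozen
        g = torch.Generator().manual_seed(seed)
        out_f, in_f = base.weight.shape
        # Fixed (non-trainable) random directions for the rank-1 update.
        self.register_buffer("u", torch.randn(out_f, 1, generator=g) / out_f**0.5)
        self.register_buffer("v", torch.randn(1, in_f, generator=g) / in_f**0.5)
        # Fixed random projection from the 13 shared params to one scalar gain.
        self.register_buffer("proj", torch.randn(shared_theta.numel(), generator=g))
        self.theta = shared_theta  # the ONLY trainable tensor

    def forward(self, x):
        gain = self.theta @ self.proj        # scalar derived from 13 params
        delta = gain * (self.u @ self.v)     # rank-1 weight update
        return self.base(x) + x @ delta.T    # frozen path + tiny steer

# 13 trainable parameters, shared across every adapted layer.
theta = nn.Parameter(torch.zeros(13))
layer = TinySteeredLinear(nn.Linear(4096, 4096), theta, seed=0)
print(theta.numel())  # -> 13
```

The key property this sketch illustrates: only the 13 values in theta receive gradients; every weight, every random direction, and every projection is frozen or fixed at init, so the entire "checkpoint" of what was learned fits in 26 bytes at bf16 precision.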
If you care about alignment, local LLMs, or the future of AI efficiency, this one will rewire how you think about model steering.
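For context on the GRPO item above: GRPO (Group Relative Policy Optimization) replaces PPO's learned critic with a group-relative baseline, normalizing each sampled answer's reward against the other answers drawn for the same prompt. A minimal sketch of that advantage computation (function name and epsilon value are my own):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Group-relative advantages as in GRPO: score each sampled completion
    against the mean and std of its own group, so no value network is needed.

    rewards: (num_prompts, group_size) reward per sampled completion.
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 8 completions for one prompt, binary correctness rewards.
r = torch.tensor([[1., 0., 0., 1., 1., 0., 0., 0.]])
print(grpo_advantages(r))  # correct answers get positive advantage
```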
Join my AI newsletter:
https://upaspro.com/newsletter/
More information: https://upaspro.com/is-sft-dead-how-m...
👇 Timestamps:
00:00 - Train Qwen 2.5 with 26 bytes
02:02 - 1. Myth of Capacity
04:13 - 2. Signal-to-Noise Ratio
06:28 - 3. Inverse Scaling Law
07:09 - Recap
#AI #LLM #TinyLoRA #ReinforcementLearning #MetaAI #MachineLearning #OpenSourceAI #Alignment #Qwen #DeepLearning