The Unreasonable Effectiveness of Reasoning Distillation: using DeepSeek R1 to beat OpenAI o1
Author: Latent Space
Uploaded: 2025-01-24
Views: 15755
Description:
https://www.bespokelabs.ai/blog/bespo...
We trained Bespoke-Stratos-32B, our reasoning model distilled from DeepSeek-R1 using Berkeley NovaSky’s Sky-T1 data pipeline. The model outperforms Sky-T1 and o1-preview in reasoning (Math and Code) benchmarks and almost reaches the performance of DeepSeek-R1-Distill-Qwen-32B while being trained on 47x fewer examples.