LLaDA2.0 100B Diffusion Language Model: AR to dLLM Conversion & Scalable Training
Author: AITech_Trends
Uploaded: 2025-12-20
Views: 41
Description:
This video covers the LLaDA2.0 paper, which introduces a scalable paradigm for converting traditional autoregressive (AR) language models into discrete diffusion LLMs (dLLMs) through a novel training pipeline.
📌 Three-phase training strategy (Warmup-Stable-Decay) for efficient AR→dLLM transformation
📌 Open-sourced LLaDA2.0-mini (16B) and LLaDA2.0-flash (100B) with optimized performance
📌 Benefits of parallel decoding and practical deployment considerations
#DiffusionModel #LLaDA2 #LargeLanguageModels #AIResearch