FlexOlmo: Open Language Models for Flexible Data Use
Author: Simons Institute for the Theory of Computing
Uploaded: 2026-02-25
Views: 36
Description:
Sewon Min (UC Berkeley)
https://simons.berkeley.edu/talks/sew...
Learning from Heterogeneous Sources
Large language models are often limited by data, especially when valuable datasets are distributed across institutions or cannot be shared. We introduce FlexOlmo, a new class of Mixture-of-Experts (MoE) models designed for flexible, modular data use. In FlexOlmo, expert modules are trained independently on separate datasets and later merged seamlessly into a single model. This enables distributed training without data sharing, supports the use of closed datasets, and allows data to be opt-in or opt-out at inference time. We scale FlexOlmo to 37B parameters (20B active) and evaluate on 31 diverse downstream tasks. FlexOlmo significantly outperforms models trained on public data only and approaches the performance of an upper-bound model trained on all datasets. By enabling modular integration of closed data while respecting data ownership and control, FlexOlmo offers a practical path toward collaborative, continuous model development.