Samuel Ainsworth - Git Re-Basin: Merging Models modulo Permutation Symmetries

Author: Columbia Vision Seminar

Uploaded: 2023-01-26

Views: 1479

Description: January 26th, 2023. Columbia University

Abstract:

The success of deep learning is due in large part to our ability to solve certain massive non-convex optimization problems with relative ease. Though non-convex optimization is NP-hard, simple algorithms -- often variants of stochastic gradient descent -- exhibit surprising effectiveness in fitting large neural networks in practice. We argue that neural network loss landscapes contain (nearly) a single basin after accounting for all possible permutation symmetries of hidden units a la Entezari et al. (2021). We introduce three algorithms to permute the units of one model to bring them into alignment with a reference model in order to merge the two models in weight space. This transformation produces a functionally equivalent set of weights that lie in an approximately convex basin near the reference model. Experimentally, we demonstrate the single basin phenomenon across a variety of model architectures and datasets, including the first (to our knowledge) demonstration of zero-barrier linear mode connectivity between independently trained ResNet models on CIFAR-10 and CIFAR-100. Additionally, we identify intriguing phenomena relating model width and training time to mode connectivity. Finally, we discuss shortcomings of the linear mode connectivity hypothesis, including a counterexample to the single basin theory.
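
The permutation symmetry the abstract relies on can be illustrated with a small sketch. This is not any of the three algorithms from the talk. It is a toy NumPy example, with hypothetical names (`mlp`, `permute_hidden`, `unit_features`), showing that reordering the hidden units of a two-layer ReLU network leaves its function unchanged, and that a permuted copy can be brought back into alignment with a reference model. The matching step here is a simple nearest-neighbor stand-in that only works because the second model is an exact permuted copy of the first; independently trained models would need a proper assignment solver.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """Two-layer ReLU MLP: W2 @ relu(W1 @ x + b1) + b2."""
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and b1, and the matching columns of W2."""
    return W1[perm], b1[perm], W2[:, perm]

def unit_features(W1, b1, W2):
    """One row per hidden unit: its incoming weights, bias, and outgoing weights."""
    return np.concatenate([W1, b1[:, None], W2.T], axis=1)

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 8, 3

# Reference model (random weights stand in for a trained network).
W1 = rng.normal(size=(d_hidden, d_in))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_out, d_hidden))
b2 = rng.normal(size=d_out)

# A second model that is the same function with shuffled hidden units.
perm = rng.permutation(d_hidden)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

# Both weight settings compute the same output for any input.
x = rng.normal(size=d_in)
same_function = np.allclose(mlp(x, W1, b1, W2, b2), mlp(x, W1p, b1p, W2p, b2))

# Toy alignment (assumption: exact permuted copy): match each reference unit
# to its closest unit in the second model by squared distance. In general
# this greedy rule need not even produce a valid permutation.
fa = unit_features(W1, b1, W2)
fb = unit_features(W1p, b1p, W2p)
cost = ((fa[:, None, :] - fb[None, :, :]) ** 2).sum(axis=-1)
match = cost.argmin(axis=1)

# Applying the recovered permutation puts the second model back in the
# reference model's coordinate system, so weight-space averaging is meaningful.
W1a, b1a, W2a = permute_hidden(W1p, b1p, W2p, match)
merged_W1 = 0.5 * (W1 + W1a)
```

In this contrived case the merged weights equal the reference weights exactly. Averaging `W1` with the unaligned `W1p` instead would mix unrelated hidden units, which is the failure mode that alignment before merging is meant to avoid.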

Bio:

Samuel Ainsworth is a Senior Research Scientist at Cruise AI Research, where he studies imitation learning, robustness, and efficiency. He completed his undergraduate degree in Computer Science and Applied Mathematics at Brown University and received his PhD from the School of Computer Science and Engineering at the University of Washington. His research interests span reinforcement learning, deep learning, programming languages, and drug discovery. He has previously worked on recommender systems, Bayesian optimization, and variational inference at organizations such as The New York Times and Google.

