On Merging and MoErging Models and Modules - Ivan Vulić (University of Cambridge / Google DeepMind)

Author: HiTZ zentroa

Uploaded: 2025-12-23

Views: 50

Description: Summary:

Despite recent tendencies towards building large "monolithic" neural models, fine-tuned expert models and parameter-efficient specialised modules still offer gains over large monoliths in specific tasks and for specific data distributions (e.g., low-resource languages or specialised domains). Moreover, such modularisation of skills and expertise into dedicated models or modules allows for asynchronous, decentralised, and more efficient continuous model development, as well as module reusability. However, a central question remains: how to combine and compose these modules to enable positive transfer, sample-efficient learning, and improved out-of-domain generalisation. In this talk, after discussing the key advantages of modularisation and modular specialisation, I will provide an overview of prominent module and model composition strategies. I will focus on composition at the parameter level (model merging) and functional level (model MoErging), and then illustrate the usefulness of these techniques across several applications.
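To make the contrast between the two composition levels concrete, here is a minimal sketch that is not taken from the talk. It assumes two toy linear "experts" represented as plain weight vectors. Uniform weight averaging stands in for parameter-level merging, and a softmax router over expert outputs stands in for function-level MoErging. All names (`merge_parameters`, `moerge_outputs`, `router`) are hypothetical and chosen only for illustration.

```python
import math

def linear(params, x):
    """Toy model: dot product of a weight vector with the input."""
    return sum(p * xi for p, xi in zip(params, x))

def merge_parameters(experts, weights=None):
    """Parameter-level composition (assumed here: weighted averaging).

    Collapses several expert weight vectors into ONE set of parameters.
    """
    n = len(experts)
    weights = weights or [1.0 / n] * n
    dim = len(experts[0])
    return [sum(w * e[i] for w, e in zip(weights, experts)) for i in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moerge_outputs(experts, router, x):
    """Function-level composition (assumed here: soft routing).

    Experts stay separate; a router scores them per input and the
    final output is the gate-weighted mix of the expert outputs.
    """
    gates = softmax([linear(r, x) for r in router])
    return sum(g * linear(e, x) for g, e in zip(gates, experts))

# Two hypothetical experts, each specialised on one input feature.
expert_a = [1.0, 0.0]
expert_b = [0.0, 1.0]

merged = merge_parameters([expert_a, expert_b])
router = [[5.0, 0.0], [0.0, 5.0]]  # hypothetical router weights, one row per expert

x = [2.0, 0.0]  # an input that "belongs" to expert_a
merged_out = linear(merged, x)
routed_out = moerge_outputs([expert_a, expert_b], router, x)
```

In this toy setting the merged model applies the same averaged weights to every input, while the routed composition leans almost entirely on `expert_a` for this particular `x`. Real merging and routing methods are considerably more involved than this sketch.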
