Antoine Miech - Flamingo: a Visual Language Model for Few-Shot Learning

Author: Columbia Vision Seminar

Uploaded: 2022-10-06

Views: 911

Description: October 6th, 2022. Columbia University

Abstract:

Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. In this talk, I will introduce Flamingo, a family of Visual Language Models (VLMs) with this ability. We propose key architectural innovations to: (i) bridge powerful pretrained vision-only and language-only models, (ii) handle sequences of arbitrarily interleaved visual and textual data, and (iii) seamlessly ingest images or videos as inputs. Thanks to their flexibility, Flamingo models can be trained on large-scale multimodal web corpora containing arbitrarily interleaved text and images, which is key to endowing them with in-context few-shot learning capabilities. We perform a thorough evaluation of our models, exploring and measuring their ability to rapidly adapt to a variety of image and video tasks. These include open-ended tasks such as visual question-answering, where the model is prompted with a question that it has to answer; captioning tasks, which evaluate the ability to describe a scene or an event; and close-ended tasks such as multiple-choice visual question-answering. For tasks lying anywhere on this spectrum, a single Flamingo model can achieve a new state of the art with few-shot learning, simply by prompting the model with task-specific examples. On numerous benchmarks, Flamingo outperforms models fine-tuned on thousands of times more task-specific data.
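
To make "prompting the model with task-specific examples" concrete, here is a minimal sketch of how a few-shot visual question-answering prompt with interleaved images and text might be assembled. It is illustrative only and is not taken from the talk: the `<image>` placeholder, the `Example` class, the `build_few_shot_prompt` function, the "Question: ... Answer: ..." template, and the file names are all assumptions. The sketch only builds the prompt; it does not call any model.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical placeholder marking where an image sits in the text stream.
IMAGE_TOKEN = "<image>"


@dataclass
class Example:
    image_path: str  # hypothetical path to a support image
    question: str
    answer: str


def build_few_shot_prompt(
    shots: List[Example], query_image: str, query_question: str
) -> Tuple[str, List[str]]:
    """Return (text, image_paths) with one IMAGE_TOKEN per image, in order."""
    parts: List[str] = []
    images: List[str] = []
    for shot in shots:
        # Each support example contributes one image followed by its text.
        parts.append(
            f"{IMAGE_TOKEN}Question: {shot.question} Answer: {shot.answer}"
        )
        images.append(shot.image_path)
    # The query follows the same pattern but leaves the answer open,
    # so a model would complete it.
    parts.append(f"{IMAGE_TOKEN}Question: {query_question} Answer:")
    images.append(query_image)
    return "\n".join(parts), images


if __name__ == "__main__":
    shots = [
        Example("cat.jpg", "What animal is this?", "A cat."),
        Example("bus.jpg", "What color is the bus?", "Red."),
    ]
    text, images = build_few_shot_prompt(shots, "query.jpg", "What is on the table?")
    print(text)
    print(images)
```

The number of support examples is just the length of `shots`, so the same function covers zero-shot (an empty list) through many-shot prompts without any change to model weights.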

Bio:

Antoine Miech is a Senior Research Scientist at DeepMind. He completed his computer vision Ph.D. at Inria and Ecole Normale Supérieure, working with Dr. Ivan Laptev and Dr. Josef Sivic. His main research interests are vision and language understanding, and he is best known for his work on HowTo100M. Prior to joining DeepMind, he was awarded the Google Ph.D. fellowship in 2018 for his initial work on video understanding.
