VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers
Author: Cognitive AI
Uploaded: 2022-04-06
Views: 2,734
Description:
VL-InterpreT was accepted to CVPR 2022.
Paper: https://arxiv.org/abs/2203.17247
Demo: http://vlinterpretenv4env-env.eba-vmh...
VL-InterpreT provides novel interactive visualizations for interpreting the attentions and hidden representations in multimodal transformers. It is a task-agnostic, integrated tool that (1) tracks a variety of statistics in attention heads throughout all layers for both vision and language components, (2) visualizes cross-modal and intra-modal attentions through easily readable heatmaps, and (3) plots the hidden representations of vision and language tokens as they pass through the transformer layers. In this paper, we demonstrate the functionalities of VL-InterpreT through the analysis of KD-VLP, an end-to-end pretrained vision-language multimodal transformer-based model, on the tasks of Visual Commonsense Reasoning (VCR) and WebQA, two visual question answering benchmarks. We also present several interesting findings about multimodal transformer behaviors that were uncovered through our tool.
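To make the cross-modal heatmap idea concrete, here is a minimal sketch (not from the paper or the tool's code) of how such a heatmap could be derived from a single layer's attention weights. It assumes a hypothetical layout where text tokens precede image tokens in the sequence; the function name and shapes are illustrative assumptions, not VL-InterpreT's API.

```python
import numpy as np

def cross_modal_attention(attn, n_text):
    """Average text-to-image attention across heads.

    attn: array of shape (num_heads, seq_len, seq_len), the attention
          weights from one transformer layer (rows sum to 1).
    n_text: number of text tokens; image tokens are assumed to follow
            them in the sequence (an illustrative assumption).
    """
    # Slice the text-to-image block of the attention matrix and
    # average over heads to obtain one heatmap for the layer.
    return attn[:, :n_text, n_text:].mean(axis=0)

# Toy example: 2 heads, 3 text tokens followed by 2 image tokens.
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 5, 5))
# Row-wise softmax so each row is a valid attention distribution.
attn = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
heatmap = cross_modal_attention(attn, n_text=3)
print(heatmap.shape)  # (3, 2): one row per text token, one column per image token
```

A per-head heatmap (dropping the `.mean(axis=0)`) would support the tool's head-level statistics as well; averaging is just one simple aggregation choice.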