VISION SPARSE AUTOENCODERS: Overview + Walkthrough of Running an SAE
Author: Sonia Joseph
Uploaded: 2025-06-10
Views: 563
Description:
In this video, we explore how Vision Sparse Autoencoders (SAEs) work — from conceptual foundations to feeding a real image through the model.
⏱️ Timestamps:
History of Vision SAEs
0:00 Introduction to vision sparse autoencoders
2:40 Negative results in sparse autoencoders
3:13 History of SAEs is similar to the history of probes
3:55 SAEs as analytic probes
4:40 SAEs in vision
5:59 Prisma library
Demo - pass an image into a vision SAE
7:03 Setup environment
8:42 Load CLIP SAE from Prisma suite
14:20 Load hooked CLIP model
18:20 Load ImageNet dataset
23:31 Feed in parrot image
24:50 Feed parrot image into SAE and cache activations
32:16 Feed in ImageNet validation into SAE to get feature semantics
42:53 Visualize top images per feature
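The demo steps above boil down to a simple loop: cache a residual-stream activation from the hooked CLIP model, pass it through the SAE encoder to get sparse feature codes, then rank features by activation to find candidates for labeling with top ImageNet images. A minimal NumPy sketch of that SAE forward pass is below; the dimensions and weight names are illustrative placeholders, not the Prisma suite's actual API, and the random weights stand in for a trained SAE loaded from the suite.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a CLIP-like residual stream of width 768,
# expanded into a 16x wider dictionary of features.
d_model, d_sae = 768, 768 * 16

# Random stand-in weights; in the demo these come from a trained
# CLIP SAE in the Prisma suite (names here are illustrative).
W_enc = rng.normal(0, 0.02, (d_model, d_sae))
W_dec = rng.normal(0, 0.02, (d_sae, d_model))
b_enc = np.zeros(d_sae)
b_dec = np.zeros(d_model)

def sae_forward(x):
    """Encode one activation vector into feature codes, then reconstruct it."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)  # ReLU codes; an L1 penalty during
                                            # training is what makes them sparse
    x_hat = f @ W_dec + b_dec               # linear decoder reconstructs the input
    return f, x_hat

# Stand-in for one patch activation cached from the hooked CLIP model
# (e.g. the parrot image in the demo).
x = rng.normal(size=d_model)
features, recon = sae_forward(x)

# Indices of the most active features for this patch: these are the
# features you would label by collecting their top-activating
# ImageNet validation images.
top_k = np.argsort(features)[::-1][:5]
```

In the actual notebook the activations come from forward hooks on the CLIP vision transformer rather than random vectors, and the top-images-per-feature step aggregates these codes over the whole ImageNet validation set.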
📓 Colab Notebook:
https://colab.research.google.com/dri...
💻 GitHub Repo:
https://github.com/Prisma-Multimodal/...
📄 Whitepaper:
https://arxiv.org/abs/2504.19475
🐦 Twitter/X:
https://x.com/soniajoseph_
----
Papers (in order mentioned)
SAE papers
Sparse Autoencoders Find Highly Interpretable Features in Language Models
https://arxiv.org/pdf/2309.08600
Steering CLIP’s vision transformer with sparse autoencoders
https://arxiv.org/abs/2504.08729
Negative Results
Sparse Autoencoders Trained on the Same Data Learn Different Features
https://arxiv.org/abs/2501.16615
Sparse Autoencoders Can Interpret Randomly Initialized Transformers
https://arxiv.org/abs/2501.17727
Negative Results for SAEs On Downstream Tasks and Deprioritising SAE Research (GDM Mech Interp Team Progress Update #2)
https://www.lesswrong.com/posts/4uXCA...
Are Sparse Autoencoders Useful? A Case Study in Sparse Probing
https://arxiv.org/pdf/2502.16681
Sparse Autoencoder Use Cases?
Auditing Language Models for Hidden Objectives
https://assets.anthropic.com/m/317564...
Linear Probes
Understanding intermediate layers using linear classifier probes
https://arxiv.org/pdf/1610.01644
Information-Theoretic Probing for Linguistic Structure
https://arxiv.org/pdf/2004.03061
A Non-Linear Structural Probe
https://arxiv.org/pdf/2105.10185
SAE improvements
Scaling and evaluating sparse autoencoders
https://cdn.openai.com/papers/sparse-...
SAEs as analytic probes
How Visual Representations Map to Language Feature Space in Multimodal LLMs
Vision SAEs
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
https://arxiv.org/abs/2502.03714
Archetypal SAE: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
https://arxiv.org/pdf/2502.12892
Steering CLIP’s vision transformer with sparse autoencoders
https://arxiv.org/abs/2504.08729
Past autoencoder work
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
https://openreview.net/forum?id=Sy2fz...
Transcoders and crosscoders
Transcoders Find Interpretable LLM Feature Circuits
https://arxiv.org/abs/2406.11944
Sparse Crosscoders for Cross-Layer Features and Model Diffing
https://transformer-circuits.pub/2024...
The Prisma Library Whitepaper:
https://arxiv.org/abs/2504.19475