"Promises and Limitations of Causality for Machine Learning Interpretability" - Tiago Pimentel
Author: EPFL AI Center
Uploaded: 2026-03-08
Views: 35
Description:
The talk was organized by the EPFL AI Center, as part of our AI Fundamentals series.
Title
Promises and Limitations of Causality for Machine Learning Interpretability
Abstract
How can we move from observing what a model does to understanding why it does it? In this talk, I argue that causality is the key to uncovering the mechanisms underlying model predictions. First, I examine a “macro” view of model analysis, showing how econometric tools—such as regression discontinuity or difference-in-differences—can isolate the causal impact of specific design choices, like tokeniser and training data selection, on a model’s outputs. Second, I turn to a “micro” view of mechanistic interpretability, focusing on causal abstraction as a method to verify whether a model implements a high-level algorithm. I demonstrate that this approach faces a critical limitation: without strict assumptions about how models encode information, the framework becomes vacuous, implying that any model implements any algorithm. This reveals that the ability to predictably intervene on a model is not, on its own, sufficient to guarantee that we understand it. I conclude with a short discussion of how causality can be used to develop more principled interpretability methods.
Bio
Tiago Pimentel is a Postdoctoral Researcher at ETH Zürich, working in machine learning interpretability and psycholinguistics. His long-term goal is to understand how humans and machines process language. To this end, his research adopts an interdisciplinary approach, leveraging information theory and causality to study the mechanisms behind model behaviour and human cognition.
Recording date: February 24, 2026
Location: EPFL Campus