Talk on "What Kinds of Functions Do Neural Networks Learn? Low-Norm vs. Flat Solutions"
Author: SysConTalks
Uploaded: 2026-02-19
Views: 40
Description: This talk investigates the fundamental differences between low-norm and flat solutions of shallow ReLU network training problems, particularly in high-dimensional settings. We sharply characterize the regularity of the functions learned by neural networks in these two regimes. This enables us to show that global minima with small weight norms exhibit strong generalization guarantees that are dimension-independent. In contrast, local minima that are "flat" can generalize poorly as the input dimension increases. We attribute this gap to a phenomenon we call neural shattering, where neurons specialize to extremely sparse input regions, resulting in activations that are nearly disjoint across data points. This forces the network to rely on large weight magnitudes, leading to poor generalization. Our analysis establishes an exponential separation between flat and low-norm minima. In particular, while flatness does imply some degree of generalization, we show that the corresponding convergence rates necessarily deteriorate exponentially with input dimension. These findings suggest that flatness alone does not fully explain the generalization performance of neural networks.
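The description attributes the failure of flat minima to "neural shattering": in high dimension, a neuron with a positive bias activates on an ever-thinner slice of the input space, so its activations become nearly disjoint across data points. The following toy sketch (not code from the talk; the data distribution, bias value, and function name are illustrative assumptions) shows the underlying concentration effect, namely that the fraction of points activating a single random ReLU neuron shrinks rapidly as the input dimension grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def active_fraction(d, n=2000, bias=0.3):
    """Fraction of n random points on the unit sphere in R^d that
    activate one random unit-norm ReLU neuron with the given bias."""
    # data points drawn uniformly on the unit sphere
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    # a single random unit-norm neuron direction
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)
    # neuron is active where ReLU(w.x - bias) > 0, i.e. w.x > bias
    return float(np.mean(X @ w > bias))

# As d grows, inner products w.x concentrate near 0 (std ~ 1/sqrt(d)),
# so the active region captures a vanishing fraction of the data.
for d in [2, 10, 100, 1000]:
    print(f"d={d:5d}  active fraction ≈ {active_fraction(d):.4f}")
```

With many such nearly-empty neurons, fitting the data requires each neuron to carry a large output weight on its few active points, which is the large-norm mechanism the description blames for poor generalization.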