Grokking Visualized: Watch a Neural Network Discover the Circle
Author: datasysdev
Uploaded: 2025-12-25
Views: 20
Description:
What does it look like when a neural network suddenly "gets it"?
This visualization shows token embeddings from a small transformer learning modular addition (a + b mod 113). For thousands of epochs, the model memorizes the training data while the embeddings remain scattered chaos. Then, around epoch 20,000, something remarkable happens: the network "groks" the underlying pattern and the embeddings snap into a perfect circle.
Why a circle? The model discovers that numbers in modular arithmetic are well represented by Fourier components: each number n sits at angle 2πkn/p on a circle, where p = 113 is the modulus and k is an integer frequency. In this representation, adding two numbers mod p amounts to adding their angles, which is why the solution generalizes to unseen pairs. The network finds it through gradient descent alone.
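To make the angle picture concrete, here is a minimal sketch (not the model's actual code) of why a circular embedding solves modular addition. The frequency k = 5 and the helper names `embed`, `add_on_circle`, and `decode` are hypothetical choices for illustration; only p = 113 comes from the description.

```python
import numpy as np

p = 113  # modulus from the description
k = 5    # hypothetical frequency; any k in 1..p-1 works because p is prime

def embed(n, k=k, p=p):
    """Place residue n on the unit circle at angle 2*pi*k*n/p."""
    theta = 2 * np.pi * k * n / p
    return np.array([np.cos(theta), np.sin(theta)])

def add_on_circle(a, b):
    """Add two residues by adding their angles (complex multiplication)."""
    za = complex(*embed(a))
    zb = complex(*embed(b))
    return za * zb

def decode(z, k=k, p=p):
    """Return the residue whose circle position is closest to z."""
    candidates = np.exp(2j * np.pi * k * np.arange(p) / p)
    return int(np.argmin(np.abs(candidates - z)))

a, b = 100, 50
result = decode(add_on_circle(a, b))
print(result, (a + b) % p)  # both are 37
```

Because the angles wrap around after a full turn, the sum automatically reduces mod p. A trained transformer would implement the equivalent of these steps with learned weights rather than explicit trigonometry.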
What you're seeing:
Left: Token embeddings projected onto the dominant Fourier frequency (cos/sin basis)
Top right: Circularity metric — how well points lie on a circle
Middle right: Fourier spectrum — showing which frequency dominates
Bottom right: Test accuracy — the sudden jump is "grokking"
The colored lines connect consecutive numbers (0→1→2→...→112→0), so a perfect circle means the network learned the cyclic structure of modular arithmetic.
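The panels described above could be computed roughly as follows. This is a sketch under stated assumptions: the description does not define its circularity metric or its projection, so `circularity` (one minus the relative spread of radii) and `project_to_plane` are hypothetical stand-ins, and the embedding matrices are synthetic rather than taken from a trained model.

```python
import numpy as np

def dominant_frequency(E):
    """E is a (p, d) embedding matrix. Return the frequency k with the most
    Fourier power along the token axis (DC skipped; k and p-k are merged)."""
    p = E.shape[0]
    power = (np.abs(np.fft.fft(E, axis=0)) ** 2).sum(axis=1)
    half = power[1 : (p - 1) // 2 + 1]
    return int(np.argmax(half)) + 1

def project_to_plane(E, k):
    """Project each token embedding onto the cos/sin directions of frequency k."""
    p = E.shape[0]
    theta = 2 * np.pi * k * np.arange(p) / p
    u = np.cos(theta) @ E  # embedding-space direction tied to cos
    v = np.sin(theta) @ E  # embedding-space direction tied to sin
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return np.stack([E @ u, E @ v], axis=1)  # (p, 2) points

def circularity(points):
    """Assumed metric: 1 minus relative spread of radii (1.0 is a perfect circle)."""
    r = np.linalg.norm(points - points.mean(axis=0), axis=1)
    return 1.0 - r.std() / r.mean()

# Synthetic stand-ins for "after grokking" and "before grokking" embeddings.
rng = np.random.default_rng(0)
p, d, k_true = 113, 16, 7
basis, _ = np.linalg.qr(rng.normal(size=(d, 2)))  # two orthonormal directions
theta = 2 * np.pi * k_true * np.arange(p) / p
E_grokked = np.cos(theta)[:, None] * basis[:, 0] + np.sin(theta)[:, None] * basis[:, 1]
E_grokked = E_grokked + 0.01 * rng.normal(size=(p, d))
E_random = rng.normal(size=(p, d))

k_found = dominant_frequency(E_grokked)
score_grokked = circularity(project_to_plane(E_grokked, k_found))
score_random = circularity(project_to_plane(E_random, dominant_frequency(E_random)))
print(k_found, score_grokked, score_random)
```

On the synthetic "grokked" matrix the dominant frequency is recovered and the score sits close to 1, while the random matrix scores noticeably lower, which mirrors the jump the circularity panel is meant to show.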