
Information Theory for Language Models: Jack Morris

Author: Latent Space

Uploaded: 2025-07-02

Views: 9782

Description: Our last AI PhD grad student feature was Shunyu Yao, who happened to focus on Language Agents for his thesis and immediately went to work on them for OpenAI. Our pick this year is Jack Morris, who bucks the “hot” trends by not working on agents, benchmarks, or VS Code forks, but is instead known for his work on the information-theoretic understanding of LLMs, starting from embedding models and latent space representations (always close to our heart).

Jack is an unusual combination: he does underrated research yet still manages to explain it well to a mass audience, so we felt this was a good opportunity to do a different kind of episode, walking through the greatest hits of a high-profile AI PhD and relating them to questions from AI Engineering.


Papers and references mentioned

AI grad school: https://x.com/jxmnop/status/193388451...
A new type of information theory: https://x.com/jxmnop/status/190423840...
Embeddings
Text Embeddings Reveal (Almost) As Much As Text (toy inversion sketch after this list): https://arxiv.org/abs/2310.06816
Contextual document embeddings: https://arxiv.org/abs/2410.02525
Harnessing the Universal Geometry of Embeddings: https://arxiv.org/abs/2505.12540
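
As a toy illustration of the inversion objective behind Text Embeddings Reveal (Almost) As Much As Text: the paper trains a dedicated inversion model, whereas this sketch only re-ranks hand-written guesses against a target embedding. The model name and candidate strings are illustrative placeholders, not anything from the episode.

# A minimal sketch, NOT the paper's method: the real attack learns a generator
# that reconstructs text conditioned on an embedding. This only demonstrates the
# objective: given just an embedding, find text whose embedding is close to it.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

hidden_text = "the patient was prescribed 20mg of atorvastatin"  # never shown to the attacker
target = model.encode(hidden_text, convert_to_tensor=True)       # the attacker sees only this vector

candidates = [
    "the patient was prescribed a statin",
    "the patient was prescribed 20mg of atorvastatin",
    "the weather in Ithaca was cold today",
]
scores = util.cos_sim(target, model.encode(candidates, convert_to_tensor=True))[0]
best = candidates[int(scores.argmax())]
print(f"closest guess: {best!r} (cosine similarity {float(scores.max()):.3f})")

The paper's point is that a trained inverter does far better than this kind of naive search, which is why an embedding vector can be nearly as sensitive as the text it encodes.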

Language models
GPT-style language models memorize ~3.6 bits per param (rough capacity arithmetic after these references): https://x.com/jxmnop/status/192990302...
Approximating Language Model Training Data from Weights: https://arxiv.org/abs/2506.15553

LLM Inversion: https://x.com/jxmnop/status/193604466...

"There Are No New Ideas In AI.... Only New Datasets":
https://x.com/jxmnop/status/191008709...
https://blog.jxmo.io/p/there-are-no-n...

misc reference: https://junyanz.github.io/CycleGAN/
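
As a back-of-the-envelope reading of the ~3.6 bits-per-parameter estimate above (a sketch of the arithmetic only, not a calculation from the paper; the model sizes are illustrative):

# Rough memorization capacity implied by ~3.6 bits per parameter,
# converted to megabytes (8 bits per byte) for a few example model sizes.
BITS_PER_PARAM = 3.6  # reported estimate for GPT-style models

for n_params in (124e6, 1.5e9, 7e9):
    capacity_mb = n_params * BITS_PER_PARAM / 8 / 1e6
    print(f"{n_params/1e9:.3g}B params -> roughly {capacity_mb:,.0f} MB of memorized content")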



—

For others hiring AI PhDs, Jack also wanted to shout out Zach Nussbaum, his coauthor on Nomic Embed: Training a Reproducible Long Context Text Embedder.

Timestamps:

00:00 Introduction to Jack Morris
01:18 Career in AI
03:29 The Shift to AI Companies
03:57 The Impact of ChatGPT
04:26 The Role of Academia in AI
05:49 The Emergence of Reasoning Models
07:07 Challenges in Academia: GPUs and HPC Training
11:04 The Value of GPU Knowledge
14:24 Introduction to Jack's Research
15:28 Information Theory
17:10 Understanding Deep Learning Systems
19:00 The "Bit" in Deep Learning
20:25 Wikipedia and Information Storage
23:50 Text Embeddings and Information Compression
27:08 The Research Journey of Embedding Inversion
31:22 Harnessing the Universal Geometry of Embeddings
34:54 Implications of Embedding Inversion
36:02 Limitations of Embedding Inversion
38:08 The Capacity of Language Models
40:23 The Cognitive Core and Model Efficiency
50:40 The Future of AI and Model Scaling
52:47 Approximating Language Model Training Data from Weights
01:06:50 The "No New Ideas, Only New Datasets" Thesis


Related videos

Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken
[NeurIPS Best Paper] 1000 Layer Networks for Self-Supervised RL — Kevin Wang et al, Princeton
Reasoning in Latent Space: A Look at the Research
Ilya Sutskever – We're moving from the age of scaling to the age of research
Live @ NeurIPS 2025
Why I Left Quantum Computing Research
Diffusion Language Models: The Next Big Shift in GenAI
The most complex model we actually understand
Hierarchical Reasoning Model: Substance or Hype?
The Real Reason Huge AI Models Actually Work [Prof. Andrew Wilson]
Training large language models to reason in a continuous latent space – COCONUT Paper explained
Andrej Karpathy: Software Is Changing (Again)
How to Make Small Language Models Work. Yejin Choi Presents at Data + AI Summit 2024
Rich Sutton, The OaK Architecture: A Vision of SuperIntelligence from Experience - RLC 2025
Reasoning without Language - Deep Dive into 27 mil parameter Hierarchical Reasoning Model
The Misconception that Almost Stopped AI [How Models Learn Part 1]
The Strange Math That Predicts (Almost) Anything
I Visualised Attention in Transformers
AI and the paradox of trust | Yuval Noah Harari
Yann LeCun | Self-Supervised Learning, JEPA, World Models, and the future of AI
