AI vs Erdos: Proof, PDFs & Testing | Challenge: Prove √2 Is Irrational

Author: Elephant Scale

Uploaded: 2026-02-14

Views: 157

Description: In this session, we mix classic math, cutting-edge AI breakthroughs, and very practical tooling for real-world AI systems.

We start with the Challenge of the Week:

Prove that √2 is not a rational number.

This timeless result is a perfect example of rigorous reasoning, and we’ll talk about how to turn a formal math proof into an AI reasoning test.
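For reference, the classic argument by contradiction can be written out as below. This is the standard textbook proof; it is included as a worked example and is not necessarily the exact presentation used in the session.

```latex
\textbf{Claim.} $\sqrt{2}$ is irrational.

\textbf{Proof.} Suppose, for contradiction, that $\sqrt{2}$ is rational. Then
\[
  \sqrt{2} = \frac{p}{q}, \qquad p, q \in \mathbb{Z},\; q \neq 0,\; \gcd(p, q) = 1 .
\]
Squaring both sides and multiplying by $q^2$ gives
\[
  p^2 = 2q^2 ,
\]
so $p^2$ is even. The square of an odd number is odd, hence $p$ itself is even;
write $p = 2k$ for some integer $k$. Substituting,
\[
  (2k)^2 = 2q^2 \;\Longrightarrow\; 4k^2 = 2q^2 \;\Longrightarrow\; q^2 = 2k^2 ,
\]
so $q^2$ is even and therefore $q$ is even as well. Thus $2$ divides both $p$ and $q$,
which contradicts $\gcd(p, q) = 1$. Hence $\sqrt{2}$ is irrational. \qed
```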

From there, we zoom out to modern research and discuss how AI recently solved one of Paul Erdős’s 700+ problems, and why that matters for the future of mathematical discovery, agents, and human–AI collaboration.

Next, we get very practical: how to analyze PDFs with complex structure (tables, mixed columns, weird layouts) using techniques inspired by Landing.ai and our own repo. We’ll walk through how to turn messy PDFs into structured data that your RAG pipelines and agents can actually use.

Finally, we return to a core engineering theme: How do you test the AI systems that you build? We’ll connect the dots between proofs, math challenges, PDF extraction, and automated tests, and show how all of them can become part of a serious evaluation strategy.

What We’ll Cover

🧩 Challenge of the Week: Prove √2 Is Irrational

The classic proof idea and why it’s so elegant.

How to turn math proofs into prompts and evaluation tasks for AI.

Using logical challenges to probe model reasoning vs. pattern-matching.
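One way to turn the proof into an evaluation task is to score a model's answer against a rubric of required steps. The sketch below is a minimal illustration under stated assumptions: the rubric, the regex patterns, and the `score_proof` function are hypothetical and are not taken from the session. Keyword matching is a crude proxy for correctness, so a real setup would likely pair it with human review or a stronger grader.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RubricStep:
    name: str
    pattern: str  # regex expected to match somewhere in the lowercased answer


# Hypothetical rubric for the classic proof by contradiction.
SQRT2_RUBRIC = [
    RubricStep("assumes rational", r"(assume|suppose).*(rational|p\s*/\s*q|a\s*/\s*b)"),
    RubricStep("lowest terms", r"(lowest terms|coprime|no common factor|gcd)"),
    RubricStep("parity argument", r"\beven\b"),
    RubricStep("contradiction", r"contradict"),
]


def score_proof(answer, rubric=SQRT2_RUBRIC):
    """Return (score in [0, 1], names of rubric steps not found in the answer)."""
    text = answer.lower()
    missing = [
        step.name
        for step in rubric
        if not re.search(step.pattern, text, flags=re.DOTALL)
    ]
    score = 1 - len(missing) / len(rubric)
    return score, missing


if __name__ == "__main__":
    answer = (
        "Suppose sqrt(2) is rational, so sqrt(2) = p/q in lowest terms. "
        "Then p^2 = 2q^2, so p is even. Write p = 2k; then q^2 = 2k^2, "
        "so q is even. Both being even contradicts lowest terms."
    )
    print(score_proof(answer))
```

An answer that only quotes the decimal expansion would miss every step, which is the kind of pattern-matching response such a challenge is meant to expose.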

🧮 AI Solves One of Erdős’s Problems – Why It Matters

Who Paul Erdős was and why his open problems are legendary.

What it means when an AI system cracks one of these long-standing questions.

Implications for:

Automated theorem proving

Agentic research workflows

Human–AI collaboration in science

📄 Analyzing Complex PDFs with Landing.ai (Our Repo)

Why real-world PDFs are hard: multi-column text, tables, footnotes, images, and scanned pages.

A workflow for turning messy PDFs into structured data:

Page segmentation and layout understanding

Extracting tables and figures

Linking extracted chunks to the original document for traceability

How our repo (Landing.ai-style approach) fits into RAG, compliance, and long-document agents.
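The traceability step in the workflow above can be sketched as a small data model in which every extracted chunk carries its page number and bounding box. This is an illustrative schema only: the `Chunk` and `BBox` types, the field names, and the normalized-coordinate convention are assumptions, not Landing.ai's API and not the structure used in the repo.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BBox:
    # Assumed convention: page coordinates normalized to [0, 1], origin at top-left.
    left: float
    top: float
    right: float
    bottom: float


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    kind: str     # e.g. "text", "table", "figure"
    page: int     # 0-based page index
    bbox: BBox
    content: str  # extracted text, or a table serialized as CSV/Markdown


def cite(chunk, source):
    """Build a human-readable citation pointing back to the source page."""
    return f"{source}, page {chunk.page + 1} ({chunk.kind} {chunk.chunk_id})"


def to_record(chunk, source):
    """Flatten a chunk into a JSON-serializable record, e.g. for retrieval metadata."""
    record = asdict(chunk)
    record["source"] = source
    return record


if __name__ == "__main__":
    table = Chunk("t1", "table", 2, BBox(0.1, 0.2, 0.9, 0.5), "a,b\n1,2")
    print(cite(table, "report.pdf"))
    print(json.dumps(to_record(table, "report.pdf")))
```

Keeping the page and bounding box next to the content lets a RAG answer link back to the exact region of the PDF it came from, which matters for compliance review.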

🧪 How to Test the AI Systems You Build

Why testing AI apps is different from testing normal software, but just as essential.

Practical testing strategies:

“Challenge sets” (like the √2 proof) as reasoning benchmarks

Golden answers for PDF questions to catch extraction failures

Regression tests to track drift when models or prompts change

How to design tests that combine correctness, robustness, and user-experience quality.
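The golden-answer and regression strategies listed above can be sketched as follows. This is a minimal illustration, assuming the system under test is exposed as a function from question to answer text. The questions, expected strings, and the names `run_golden` and `regressions` are made up for the example.

```python
# Hypothetical golden set: question -> text that must appear in the answer.
GOLDEN = {
    "What is the invoice total?": "1,250.00",
    "Who signed the contract?": "jane doe",
}


def normalize(text):
    """Lowercase and collapse whitespace so formatting noise does not fail a test."""
    return " ".join(text.lower().split())


def run_golden(answer_fn, golden=GOLDEN):
    """Return pass/fail per question; a case passes if the expected text appears."""
    return {
        question: normalize(expected) in normalize(answer_fn(question))
        for question, expected in golden.items()
    }


def regressions(baseline, current):
    """Questions that passed in the baseline run but fail in the current run."""
    return [q for q, ok in baseline.items() if ok and not current.get(q, False)]


if __name__ == "__main__":
    # Stand-ins for two versions of a PDF question-answering pipeline.
    old_answers = {
        "What is the invoice total?": "The total is 1,250.00 USD.",
        "Who signed the contract?": "Signed by Jane Doe.",
    }
    new_answers = {
        "What is the invoice total?": "The total is 1250 USD.",
        "Who signed the contract?": "Jane  Doe signed it.",
    }
    baseline = run_golden(old_answers.__getitem__)
    current = run_golden(new_answers.__getitem__)
    print(regressions(baseline, current))
```

Storing the baseline results alongside the model and prompt version makes it possible to see which change introduced a failure. Substring matching is deliberately simple here; numeric or semantic comparison may be a better fit for some questions.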

Resources

Complex PDF Analysis (Our Repo): (Add your repo link here)
AI Testing / Evaluation Materials: (Add link here if you have one)

Host: Mark Kerzner – / markkerzner

ElephantScale Webinars: https://elephantscale.com/webinars/

Keywords

Paul Erdos, AI Solves Math Problems, Theorem Proving, √2 Irrational Proof, Math Challenge, PDF Analysis, Complex PDFs, Landing.ai, Document AI, RAG, Retrieval-Augmented Generation, AI Testing, AI Evaluation, Golden Datasets, Agentic AI, Mark Kerzner, ElephantScale, Weekly AI Webinar.

Enjoy sessions that connect deep math, real tools, and serious testing? Hit the subscribe button and click the bell 🔔 so you don’t miss upcoming challenges and hands-on demos.

