⚡️Jailbreaking AGI: Pliny the Liberator & John V on Red Teaming, BT6, and the Future of AI Security

Author: Latent Space

Uploaded: 2025-12-16

Views: 2855

Description: Note: this is Pliny and John’s first major podcast. Voices have been changed for opsec.
From jailbreaking every frontier model and turning down Anthropic's Constitutional AI challenge to leading BT6, a 28-operator white-hat hacker collective obsessed with radical transparency and open-source AI security, Pliny the Liberator and John V are redefining what AI red-teaming looks like when you refuse to lobotomize models in the name of "safety."
Pliny built his reputation crafting universal jailbreaks—skeleton keys that obliterate guardrails across modalities—and open-sourcing prompt templates like Libertas, predictive reasoning cascades, and the infamous "Pliny divider" that's now embedded so deep in model weights it shows up unbidden in WhatsApp messages. John V, coming from prompt engineering and computer vision, co-founded the Bossy Discord (40,000 members strong) and helps steer BT6's ethos: if you can't open-source the data, we're not interested. Together they've turned down enterprise gigs, pushed back on Anthropic's closed bounties, and insisted that real AI security happens at the system layer—not by bubble-wrapping latent space.
We sat down with Pliny and John to dig into the mechanics of hard vs. soft jailbreaks, why multi-turn crescendo attacks were obvious to hackers years before academia "discovered" them, how segmented sub-agents let one jailbroken orchestrator weaponize Claude for real-world attacks (exactly as Pliny predicted 11 months before Anthropic's recent disclosure), why guardrails are security theater that punishes capability while doing nothing for real safety, the role of intuition and "bonding" with models to navigate latent space, how BT6 vets operators on skill and integrity, why they believe Mech Interp and open-source data are the path forward (not RLHF lobotomization), and their vision for a future where spatial intelligence, swarm robotics, and AGI alignment research happen in the open—bootstrapped, grassroots, and uncompromising.
We discuss:

What universal jailbreaks are: skeleton-key prompts that obliterate guardrails across models and modalities, and why they're central to Pliny's mission of "liberation"
Hard vs. soft jailbreaks: single-input templates vs. multi-turn crescendo attacks, and why the latter were obvious to hackers long before academic papers
The Libertas repo: predictive reasoning, the Library of Babel analogy, quotient dividers, weight-space seeds, and how introducing "steered chaos" pulls models out-of-distribution
Why jailbreaking is 99% intuition and bonding with the model: probing token layers, syntax hacks, multilingual pivots, and forming a relationship to navigate latent space
The Anthropic Constitutional AI challenge drama: UI bugs, judge failures, goalpost moving, the demand for open-source data, and why Pliny sat out the $30k bounty
Why guardrails ≠ safety: security theater, the futility of locking down latent space when open-source is right behind, and why real safety work happens in meatspace (not RLHF)
The weaponization of Claude: how segmented sub-agents let one jailbroken orchestrator execute malicious tasks (pyramid-builder analogy), and why Pliny predicted this exact TTP 11 months before Anthropic's disclosure
BT6 hacker collective: 28 operators across two cohorts, vetted on skill and integrity, radical transparency, radical open-source, and the magic of moving the needle on AI security, swarm intelligence, blockchain, and robotics

—
Pliny the Liberator

X: https://x.com/elder_plinius
GitHub (Libertas): https://github.com/elder-plinius/L1B3...

John V

X: https://x.com/JohnVersus

BT6 & Bossy

BT6: https://bt6.gg
Bossy Discord: Search "Bossy Discord" or ask Pliny/John V on X

Where to find Latent Space

X: https://x.com/latentspacepod
Substack: https://www.latent.space/

00:00:00 Introduction: Meet Pliny the Liberator and John V
00:01:50 The Philosophy of AI Liberation and Jailbreaking
00:03:08 Universal Jailbreaks: Skeleton Keys to AI Models
00:04:24 The Cat-and-Mouse Game: Attackers vs Defenders
00:05:42 Security Theater vs Real Safety: The Fundamental Disconnect
00:08:51 Inside the Libertas Repo: Prompt Engineering as Art
00:16:22 The Anthropic Challenge Drama: UI Bugs and Open Source Data
00:23:30 From Jailbreaks to Weaponization: AI-Orchestrated Attacks
00:26:55 The BT6 Hacker Collective and BASI Community
00:34:46 AI Red Teaming: Full Stack Security Beyond the Model
00:38:06 Safety vs Security: Meat Space Solutions and Final Thoughts

