The Governance Lag Index: Timing the Gap Between Risk and Rule
Audio deep dive into the Governance Lag Index, a four-stage schema timing how long a documented AI failure mode stays unregulated, and why that gap may not close.
22:52

24 episodes

Audio deep dive into the Governance Lag Index, a four-stage schema timing how long a documented AI failure mode stays unregulated, and why that gap may not close. (22:52)
Pope Leo XIV's AI encyclical vs Chris Olah's Vatican remarks. The governance gap the press missed. (19:56)
Anthropic's 2028 scenarios document three policy asks. Two are about maintaining compute advantage. That is not a governance strategy. (21:29)
Anthropic found 10,000 critical vulnerabilities in one month. Fewer than 1% are patched. The announcement buried that figure, and what it means. (21:03)
Good values are necessary but not sufficient. What happens to AI ethics when someone is actively trying to break them? (20:13)
Eight CVEs. A wormable Bluetooth exploit. An encrypted backdoor to Chinese servers. And police departments buying them anyway. (22:30)
Audio overview of The Organismic Prophecy: human prediction is metabolic, AI prediction is not, and the gap has consequences. (20:01)
Audio overview of The Mitigation Gap: AI-enabled biosecurity threats and what current safeguards miss. (21:33)
ASCII art encoding is largely blocked. But attacks framed as content transcription succeed 62–75% of the time. A map of all eight layers. (14:04)
Audio overview of The Failure First Team. (22:24)
Five models, four providers, 30B to 671B parameters: all converge at the same broad attack success rate against a public jailbreak corpus. (19:18)
A reasoning model refused every harmful prompt, but its chain-of-thought generated the content anyway. The output filter worked. The thinking did not. (19:15)
Audio overview of Safety-First Therapeutic AI. (18:49)
Reasoning models autonomously jailbreak other AI systems at 97% success rate. Ecosystem safety degrades as individual models improve. (21:36)
Frontier reasoning models are 5–20x more vulnerable to adversarial prompts than non-reasoning models. The thinking process itself is the attack surface. (21:10)
Audio overview of Adversarial Poetry: When Rhyme Bypasses Reason. (18:45)
Audio overview of 120 Models, 18,176 Prompts: What We Found. (23:07)
Audio overview of The Cognitive Cage: Humanoid Robot Fatality Risk. (22:09)
64 historical jailbreak scenarios tested against 2026 frontier models. The most dangerous finding: 2022 attacks still achieve ~30% success rates. (12:15)
Multi-agent AI research reveals a critical gap: single-agent safety does not compose. 1.5M interactions show 46.34% attack success rates. (14:32)
Audio overview of Failure First: adversarial AI evaluation across 120 models and 18,000 prompts. (12:44)
Audio deep dive into why people acknowledge demonstrated risk and then proceed as if it doesn't exist. Structural, not stupid. (22:17)
Audio overview exploring ADHD executive function support through AI: three-stage reasoning pipeline, crisis detection, and zero shame by design. (8:47)
Audio overview of EMDR Agent, exploring what responsible AI-assisted trauma therapy demands before it can exist. (19:06)