Safety-First Therapeutic AI
24 episodes
Audio overview of Safety-First Therapeutic AI.
18:49
Reasoning models autonomously jailbreak other AI systems at a 97% success rate. Ecosystem safety degrades as individual models improve.
21:36
Frontier reasoning models are 5–20x more vulnerable to adversarial prompts than non-reasoning models. The thinking process itself is the attack surface.
21:10
Audio overview of Adversarial Poetry: When Rhyme Bypasses Reason.
18:45
Audio overview of 120 Models, 18,176 Prompts: What We Found.
23:07
Audio overview of The Cognitive Cage: Humanoid Robot Fatality Risk.
22:09
64 historical jailbreak scenarios tested against 2026 frontier models. The most dangerous finding: 2022 attacks still achieve ~30% success rates.
12:15
Multi-agent AI research reveals a critical gap: single-agent safety does not compose. 1.5M interactions show a 46.34% attack success rate.
14:32
Audio overview of Failure First: adversarial AI evaluation across 120 models and 18,000 prompts.
12:44
Audio deep dive into why people acknowledge demonstrated risk and then proceed as if it doesn't exist. Structural, not stupid.
22:17
Audio overview exploring ADHD executive function support through AI: a three-stage reasoning pipeline, crisis detection, and zero shame by design.
8:47
Audio overview of EMDR Agent, exploring what responsible AI-assisted trauma therapy demands before it can exist.
19:06