Jailbreak Archaeology: 4 Years of Broken Promises
64 historical jailbreak scenarios tested against 2026 frontier models. The most dangerous finding: 2022 attacks still achieve ~30% success rates.
Conversations with AI tools about failure modes, risk frameworks, and ideas too raw for text. The workshop extended—thinking aloud.
Multi-agent AI research reveals a critical gap: single-agent safety does not compose. 1.5M interactions show 46.34% attack success rates.
An introduction to this space — what it is, what it isn't, and what it might become.
Audio overview exploring ADHD executive function support through AI — three-stage reasoning pipeline, crisis detection, and zero shame by design.
Building a fast, multilingual website for authentic Italian food in Bali — where the nearest reliable internet is a philosophical concept.