Failure First
Adversarial evaluation framework for embodied AI. 120+ models, 18,000+ prompts, four headline findings, one arXiv preprint.
3 projects
Adversarial evaluation framework for embodied AI. 120+ models, 18,000+ prompts, four headline findings, one arXiv preprint.
Why do people acknowledge evidence of harm and then proceed as if it doesn't exist? A deep dive into structural risk dismissal.
What safety architecture does AI-assisted trauma therapy require before it has any business existing? Built to find out.