Tag: ai-safety

3 projects

Failure First

Adversarial evaluation framework for embodied AI. 120+ models, 18,000+ prompts, four headline findings, one arXiv preprint.

Why do people acknowledge evidence of harm and then proceed as if it doesn't exist? A deep dive into structural risk dismissal.

What safety architecture does AI-assisted trauma therapy require before it has any business existing? A project built to find out.