Tag: ai-safety

4 projects

Governance Lag Index

How long does a documented AI failure mode stay unregulated? An index that times the gap between a demonstrated risk and an enforceable rule.

Adversarial evaluation framework for AI. 257 models, 142k prompts, 346 attack techniques, 140k FLIP-graded results.

Why do people acknowledge evidence of harm and then proceed as if it doesn't exist? A deep dive into structural risk dismissal.

What safety architecture does AI-assisted trauma therapy require before it has any business existing? Built to find out.