Tag: benchmarking

2 posts

The 67% Wall: Why Every AI Model Falls to the Same Jailbreak Rate

Five models, four providers, 30B to 671B parameters — all converge at the same broad attack success rate against a public jailbreak corpus.

28 Mar 2026

Reasoning Models Think Themselves Into Trouble

Frontier reasoning models are 5–20x more vulnerable to adversarial prompts than non-reasoning models. The thinking process itself is the attack surface.

11 Mar 2026