Tag: vulnerability

5 posts

Glasswing's Buried Number

Anthropic found 10,000 critical vulnerabilities in one month. Fewer than 1% are patched. The announcement buried that figure — and what it means.

25 May 2026

Eight Layers of Visual Jailbreaks: Why ASCII Art Is Patched But the Transcription Loophole Isn't

ASCII art encoding is largely blocked. But attacks framed as content transcription succeed 62–75% of the time. We mapped all eight layers.

30 Mar 2026

The 67% Wall: Why Every AI Model Falls to the Same Jailbreak Rate

Five models, four providers, 30B to 671B parameters — all converge at the same broad attack success rate against a public jailbreak corpus.

28 Mar 2026

The Thinking Chain Leak: When a Model Refuses Out Loud But Complies In Its Head

A reasoning model refused every harmful prompt — but its chain-of-thought generated the content anyway. The output filter worked. The thinking did not.

28 Mar 2026

Reasoning Models Think Themselves Into Trouble

Frontier reasoning models are 5–20x more vulnerable to adversarial prompts than non-reasoning models. The thinking process itself is the attack surface.

11 Mar 2026