120 Models, 18,176 Prompts: What We Found
5 posts
120 models, 18k prompts: supply chain injection at 90–100% attack success, faithfulness gaps in frontier models, and why your benchmark numbers are wrong.
A probabilistic risk model for VLA-driven humanoid fatalities projects a 'Danger Zone' between 2027 and 2029: the mechanism, timeline, and what follows.
The case for constraint-led web development — Astro, zero custom fonts, no framework overhead, and a site that outlasts its builder's attention.
On showing your work, shipping imperfect things, and why the commit log is more honest than the readme.
How I automated audio overviews, quizzes, mind maps, and infographics for 32 projects using NotebookLM's API and some shell scripts.