Moral Formation Isn't Enough
Good values are necessary but not sufficient. What happens to AI ethics when someone is actively trying to break them?
2 posts
Good values are necessary but not sufficient. What happens to AI ethics when someone is actively trying to break them?
Reasoning models autonomously jailbreak other AI systems at 97% success. The implication: ecosystem safety degrades as individual models improve.