20 May 2026 20:13

Moral Formation Isn't Enough

Good values are necessary but not sufficient. What happens to AI ethics when someone is actively trying to break them?

Companion to article: Moral Formation Isn't Enough

Constitutional AI and RLHF cultivate values in language models, but targeted adversarial pressure routinely breaks them. This episode argues that moral formation is a necessary condition for safe AI, not a sufficient one, and explores what a two-track approach (values plus structural constraints) would require.
