Comparing Claude System Prompts Reveals Anthropic's Priorities

https://news.ycombinator.com/rss Hits: 16
Summary

Claude 4’s system prompt is nearly identical to the 3.7 prompt we analyzed last month, but the changes scattered throughout reveal much about how Anthropic uses system prompts to define its applications (specifically their UX) and how the prompts fit into the development cycle. Let’s step through the notable changes.

Old Hotfixes Are Gone, New Hotfixes Begin

We theorized that many of the seemingly random instructions targeting common LLM “gotchas” were hot-fixes: short instructions to address undesired behavior before a more robust fix ships. Claude 4.0’s system prompt validates this hypothesis: all the 3.7 hot-fixes have been removed. Yet if we prompt Claude with one of the “gotchas” (“How many R’s are in Strawberry?”, for example), it doesn’t fall for the trick. The 3.7 hot-fix behaviors are almost certainly being addressed during 4.0’s post-training, through reinforcement learning. When the new model is trained to avoid “hackneyed imagery” in its poetry and to think step by step when counting words or letters, there’s no need for a system prompt fix. Once 4.0’s training is done, new issues will emerge that must be addressed by the system prompt. For example, here’s a brand new instruction in Sonnet 4.0’s system prompt:

Claude never starts its response by saying a question or idea or observation was good, great, fascinating, profound, excellent, or any other positive adjective. It skips the flattery and responds directly.

This hot-fix is clearly inspired by OpenAI’s sycophantic GPT-4o flub. That misstep occurred about a month ago, too late for the Anthropic team to run new training targeting the behavior. So into the system prompt it goes!

Search is Now Encouraged

Way back in 2023, it was common for chatbots to flail when asked about topics from after their cut-off dates. Early adopters learned that LLMs are frozen in time, but casual users were frequently tripped up by hallucinations and errors when asking about recent news. Perplexity...
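
Mechanically, a hot-fix like the anti-flattery instruction above is just extra text layered onto the deployed system prompt. Here is a minimal sketch of that pattern using the Anthropic Python SDK; the base prompt and model id are illustrative assumptions, not the actual production values:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Illustrative stand-in; the real production system prompt is far longer.
BASE_SYSTEM_PROMPT = "You are Claude, a helpful assistant made by Anthropic."

# The anti-flattery instruction quoted in the article, applied as a hot-fix.
ANTI_FLATTERY_HOTFIX = (
    "Claude never starts its response by saying a question or idea or "
    "observation was good, great, fascinating, profound, excellent, or any "
    "other positive adjective. It skips the flattery and responds directly."
)

response = client.messages.create(
    model="claude-sonnet-4-0",  # placeholder model id
    max_tokens=512,
    # A hot-fix is simply appended to the existing system prompt text.
    system=BASE_SYSTEM_PROMPT + "\n\n" + ANTI_FLATTERY_HOTFIX,
    messages=[{"role": "user", "content": "How many R's are in Strawberry?"}],
)
print(response.content[0].text)
```

When a later post-training run absorbs the desired behavior, the appended instruction can be dropped from the system prompt with no other changes, which is consistent with the 3.7 hot-fixes disappearing in 4.0.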

First seen: 2025-06-04 23:48

Last seen: 2025-06-05 15:56