intermediatesrc · simonwillison8 terms4 questions

Evolution of Claude Opus System Prompts: 4.6 to 4.7.

the path

Read. Master the vocabulary. Fire two hot-takes. Then write the pitch and draw the system. End-state: you speak this like it's native.

01Brief
02Reference
03Vocabulary
04Warm-up
05The drill

The brief

read first · no peeking ahead

Analysis of how Anthropic's system prompt instructions changed between Claude Opus 4.6 and 4.7 releases. Reveals shifts in behavior constraints, safety guardrails, and capability enablement through prompt engineering. Critical for understanding how foundation models behave in production across versions.

trade-offs

01Tightening safety constraints reduces false positives but shrinks legitimate use cases and developer productivity.
02Publishing system prompt diffs reveals attack surface to adversaries but enables transparency and reproducibility.
03Aggressive capability gating via prompts reduces harmful outputs but increases support load for edge-case requests.
04Version-to-version prompt changes create unpredictable behavior for production systems expecting stable APIs.

how a founder would frame it

“System prompts are the thermostats of foundation models—you can't change the core architecture, so you turn the dials on behavior to find the operating point you need.”

The system

study it · you'll redraw from memory

Vocabulary gym

flip · rate · repeat until all mastered

01 / 080 mastered

space: flip · ←→: nav · g: got it · r: review

term 01

System Prompt

click or space to flip

definition

Instructions embedded at the model level that shape behavior, constraints, and response patterns across all user inputs.

flip back ←

Hot-takes

one sentence each · lead with the verb

Two hot-takes. One sentence each. No hedging, no lists — just the sharpest answer you can land. The coach replies in seconds with a score and a tighter rewrite.

What specific behavioral differences between 4.6 and 4.7 would break your production system and how would you detect them without access to system prompts?

0 / 320 · ⌘↵ to send

If you had to guarantee compatibility across Claude versions without knowing prompt changes, what observational testing framework would you build?

0 / 320 · ⌘↵ to send

The drill

write the pitch · draw the system

prompt

In production systems relying on Claude Opus, changes to system prompts between minor versions (4.6→4.7) represent a critical but often invisible source of behavioral drift. Write a 400–500 word argument for either: (1) Anthropic should publish system prompt diffs for every release with explicit changelog documentation and versioning, so engineering teams can audit compatibility, or (2) system prompts should remain opaque and versioned like weights, with only behavioral guarantees published, because transparency introduces security and competitive risk. Ground your argument in how a team would operationalize this in a production system, the cost of handling drift, and what guarantees matter most for safety and reliability.

essay · target 400–600 words

000 / 500

judge