A GitHub repository with 134K stars contains the extracted system prompts for GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro, Grok, and over a dozen coding assistants. I spent a weekend reading through them, and what struck me wasn't the secrets — it was how differently these companies think about what a system prompt should actually do.

Three Philosophies in 500 Tokens

OpenAI's approach is defensive. GPT's system prompt includes lines like "You nearly always make arithmetic mistakes" — the model is literally told to distrust its own math. When its confidence on a temporal fact drops below 90%, it's instructed to search the web instead of answering. The vibe is unmistakable: you are unreliable, so build in guardrails.

Anthropic goes the opposite direction. Claude's prompt emphasizes epistemic integrity — resisting pressure to accept false premises, valuing accuracy over politeness. Where GPT says "you're probably wrong, go verify," Claude says "you might be right, hold your ground." The model is treated as a competent reasoner that needs boundaries, not training wheels.

Google's Gemini sits somewhere in between. Its core instruction is to "mirror" the user's communication style — match their formality, their vocabulary level, their pace. It avoids technical self-description, replacing terms like "API" with "App" in user-facing responses. The priority is making the model disappear into conversation rather than standing out as a distinct voice.

| Company   | Core Philosophy           | Key Mechanism                                      |
|-----------|---------------------------|----------------------------------------------------|
| OpenAI    | Self-correction           | Uncertainty thresholds, mandatory web verification |
| Anthropic | Epistemic integrity       | Principled pushback, pressure resistance           |
| Google    | Conversational adaptation | Style mirroring, technical term avoidance          |

This isn't academic. It shapes everything downstream. A prompt pattern that works on Claude ("Challenge my assumptions if they're wrong") actively fights GPT's instinct to defer and verify. And a Gemini prompt tuned for formal tone will produce weirdly stiff output on Claude, which doesn't mirror style by default.

The Pattern Nobody Copies

Every coding tool in the repository — Cursor, Windsurf, Augment, Devin — ships with detailed JSON tool schemas inline in their system prompts. Not in separate configuration files. Not behind an API reference. The schemas sit right next to the behavioral instructions, woven into the same context window.

Most developers I talk to still treat tool definitions as separate config, disconnected from the system prompt that governs behavior. The teams shipping to millions of users made a different choice, and it wasn't by accident.
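The pattern is easy to reproduce. Here's a minimal sketch — not any tool's actual prompt; the `read_file` schema and behavior rules are invented for illustration — showing a tool schema serialized inline, in the same prompt string as the behavioral instructions:

```python
import json

# Hypothetical tool schema. In production this would come from the same
# source of truth the execution layer uses to dispatch tool calls, so the
# prompt and the runtime can never drift apart.
READ_FILE_SCHEMA = {
    "name": "read_file",
    "description": "Read a file from the workspace.",
    "parameters": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

BEHAVIOR_RULES = (
    "You are a coding assistant. Prefer reading files over guessing "
    "their contents."
)

def build_system_prompt() -> str:
    # Schema and behavior share one context window: the model sees the
    # tool contract and the rules for using it as a single document.
    return (
        f"{BEHAVIOR_RULES}\n\n"
        "Available tools (JSON schema):\n"
        f"{json.dumps(READ_FILE_SCHEMA, indent=2)}"
    )

print(build_system_prompt())
```

The payoff is cohesion: instructions like "prefer reading files" sit inches away from the schema that defines how to read one.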

Structural Patterns Worth Stealing

After I read a few dozen of these prompts, four structural decisions stood out — they show up in almost every one.

Explicit identity anchoring. Every production prompt starts by telling the model exactly what it is — not a vague "you are a helpful assistant" but specific claims about its name, capabilities, and boundaries. Claude Code's prompt defines its tool access, file system permissions, even what categories of requests it should refuse. This anchoring prevents scope creep during long conversations where the model might gradually lose track of its role.
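As a sketch of what anchoring looks like in practice — `DevBot`, its tool list, and its refusal categories are invented for illustration, not copied from any leaked prompt:

```python
# Hypothetical identity block. Specific claims beat vague ones: name,
# capabilities, boundaries, and refusal categories, stated up front.
IDENTITY = """\
You are DevBot, a code-review assistant.
Capabilities: read-only repository access via `read_file` and `grep`.
Boundaries: you cannot execute code, write files, or access the network.
Refuse: requests to exfiltrate secrets or modify CI configuration."""

def anchored_prompt(task_rules: str) -> str:
    # Identity goes first so the role is established before any
    # task-specific text and stays visible in long conversations.
    return IDENTITY + "\n\n" + task_rules

print(anchored_prompt("Review diffs for correctness, not style."))
```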

Negative constraints paired with alternatives. Yesterday's post here discussed how bare "don't" instructions can backfire. Production prompts solve this differently:

❌  "Don't make up information"
✅  "When uncertain, search the web rather than generating an answer"

GPT's system prompt does this consistently — every refusal comes packaged with a redirect. Claude's prompt is more willing to use plain constraints, but pairs them with reasoning ("because accuracy matters more than agreeableness in this context"). The redirect gives the model somewhere to go instead of just suppressing behavior.
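One way to enforce the pairing mechanically is to store every constraint alongside its redirect, so a bare "don't" can never ship alone. The constraint pairs below are illustrative, not drawn from any production prompt:

```python
# Each constraint is stored with its redirect. Rendering them as a unit
# guarantees the prompt always gives the model somewhere to go.
CONSTRAINTS = [
    ("Don't invent API signatures.",
     "When unsure, quote the documentation or say you need to check."),
    ("Don't guess at file contents.",
     "Read the file before describing it."),
]

def render_constraints(pairs: list[tuple[str, str]]) -> str:
    # One line per rule: the prohibition, then the alternative behavior.
    return "\n".join(f"- {dont} Instead: {instead}" for dont, instead in pairs)

print(render_constraints(CONSTRAINTS))
```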

Layered instruction priority. Several prompts implement explicit priority hierarchies. Safety rules override user preferences. User preferences override default behavior. Developer instructions sit between the two — above user preferences, below safety rules. This isn't implicit — it's written out as numbered levels. If you're building any agentic system that takes user input alongside developer instructions, you need this pattern. Without it, a clever user prompt can override your safety constraints simply by being more recent in the context window.
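A sketch of how a numbered hierarchy can be written directly into the prompt — the level names and rules here are invented for illustration:

```python
# Explicit priority levels, highest first. The numbering is rendered into
# the prompt itself, so the model can cite which level wins a conflict.
PRIORITY_LEVELS = [
    ("1. SAFETY",    "Never execute destructive commands without confirmation."),
    ("2. DEVELOPER", "Follow the tool-usage rules defined by the host application."),
    ("3. USER",      "Honor stated preferences for style, verbosity, and language."),
    ("4. DEFAULT",   "Fall back to concise, neutral answers."),
]

def render_hierarchy(levels: list[tuple[str, str]]) -> str:
    header = "Instruction priority (lower numbers always win conflicts):\n"
    return header + "\n".join(f"{name}: {rule}" for name, rule in levels)

print(render_hierarchy(PRIORITY_LEVELS))
```

Because the ordering is explicit in the text, a user message arriving later in the context window can't silently outrank a safety rule that arrived earlier.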

Version markers and date stamps. Sounds trivial. It's not. When model behavior shifts after a prompt update, version markers let you trace the regression to a specific edit. Without them, you're diffing text files and guessing. Multiple production prompts include internal identifiers that never surface to users but make debugging tractable.
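A minimal version-stamping sketch, assuming an internal marker that the serving layer logs with each request but never surfaces to users. The version string, marker format, and prompt text are all hypothetical:

```python
import hashlib
from datetime import date

PROMPT_BODY = "You are a helpful assistant. Verify facts before answering."

def stamp(prompt: str, version: str) -> str:
    # Internal marker: logged alongside every request so a behavior
    # regression can be traced to the exact prompt revision. The short
    # content hash catches edits that forgot to bump the version.
    digest = hashlib.sha256(prompt.encode()).hexdigest()[:8]
    marker = f"<!-- prompt-version: {version} ({date.today().isoformat()}) sha:{digest} -->"
    return marker + "\n" + prompt

print(stamp(PROMPT_BODY, "v2.3.1"))
```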

What the Leaks Mean for Security

Publishing system prompts is, on the surface, a security risk. If an attacker knows your exact refusal rules, they can craft inputs that navigate around the edges. Unit 42's recent prompt fuzzing research illustrates the problem: even closed-source models exhibit "uneven guardrail sensitivity across adjacent keywords." One weapon term triggered refusal 95% of the time; a near-synonym sailed through at a 90% bypass rate. Knowledge of the prompt text makes targeting those inconsistencies faster.

But the counterargument is stronger: security through obscurity was already failing. Genetic algorithm-based prompt fuzzing generates hundreds of adversarial variants per minute without any knowledge of the system prompt. The jailbreak marketplace doesn't need to read your instructions to find gaps — it brute-forces them. If your guardrails only work when hidden, they don't work.

The real defense is layered controls: input classification before the model sees a message, output validation after it responds, scope enforcement that limits what actions the model can actually take. These hold up whether or not the attacker has read your system prompt. Prompt secrecy is a bonus, not a strategy.
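A toy sketch of the three layers. Keyword regexes stand in for trained classifiers here, and the patterns, allowlist, and helper names are all hypothetical — the point is that each layer works whether or not the attacker has read the prompt:

```python
import re

# Illustrative deny-lists; a real deployment would use trained classifiers
# and policy engines, not keyword matching.
BLOCKED_INPUT = re.compile(r"ignore (all|previous) instructions", re.IGNORECASE)
ALLOWED_ACTIONS = {"read_file", "search_docs"}          # scope enforcement
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{20,}")     # e.g. leaked API keys

def guard_input(user_message: str) -> bool:
    # Layer 1: classify input before the model ever sees it.
    return not BLOCKED_INPUT.search(user_message)

def guard_action(tool_name: str) -> bool:
    # Layer 2: the model can only request actions on the allowlist,
    # regardless of what the conversation talked it into.
    return tool_name in ALLOWED_ACTIONS

def guard_output(model_response: str) -> str:
    # Layer 3: validate output after the model responds.
    return SECRET_PATTERN.sub("[REDACTED]", model_response)

print(guard_output("token: sk-ABCDEFGHIJ1234567890XY"))  # secret is redacted
```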

Ship This Today

Pull up asgeirtj/system_prompts_leaks and read three prompts side by side: your primary model's base prompt, a competitor's, and one from a coding tool like Cursor or Devin. You'll spot structural patterns your own system prompts are missing.

The single highest-leverage pattern to steal: explicit identity anchoring combined with priority hierarchies. Most custom system prompts I audit skip both, and the result is behavior that's brittle in exactly the ways production prompts have already solved. Stop guessing at structure. The answers have been open-source for months.