
System Prompt Builder

Build structured system prompts with role, rules & output format.


What is a system prompt?

A system prompt is a special instruction set that defines how an AI model behaves across an entire conversation. It is sent in a separate API field from user messages: OpenAI puts it in `messages[0]` with role `system`, Anthropic exposes it as a top-level `system` parameter, and Google's Gemini API uses `systemInstruction`. The model treats this slot as persistent context that shapes every turn that follows.
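
A minimal sketch of where the same instruction block lands in each provider's Python SDK. Model names are placeholders, the prompt text is illustrative, and API keys are assumed to be configured via the environment:

```python
from openai import OpenAI
import anthropic
import google.generativeai as genai

SYSTEM = "You are a tier-one support agent for an analytics SaaS."

# OpenAI: first element of the messages array, role "system".
openai_resp = OpenAI().chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "What is my rate limit?"},
    ],
)

# Anthropic: top-level `system` parameter, separate from messages.
anthropic_resp = anthropic.Anthropic().messages.create(
    model="claude-sonnet-4-5",  # placeholder model name
    max_tokens=1024,
    system=SYSTEM,
    messages=[{"role": "user", "content": "What is my rate limit?"}],
)

# Google: `system_instruction` on the model object.
gemini_resp = genai.GenerativeModel(
    "gemini-1.5-pro",  # placeholder model name
    system_instruction=SYSTEM,
).generate_content("What is my rate limit?")
```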

Consider the same model running two products. A customer-support API might ship a system prompt like “You are a tier-one support agent for an analytics SaaS. Confirm the user's account tier before quoting limits. Never promise refunds. Escalate billing disputes over $500.” A creative-writing assistant on the same Claude Sonnet 4.6 endpoint might ship “You are a fiction collaborator. Match the user's prose style. Avoid summarizing back what the user wrote. Offer two alternative directions per scene.” Same model weights, same temperature, completely different product behavior. The only thing that changed is the system message.

How system prompts and prompt caching work together

System prompts ship with every API call. That makes them the highest-leverage caching opportunity available, because the same prefix repeats across millions of requests with zero variation.

Run the math on a real workload. A SaaS API ships a 4,000-token system prompt (role, rules, three few-shot examples, output schema) and serves 1,000,000 calls per month against GPT-5.5. At the standard input price of $5 per million tokens, those 4 billion input tokens cost $20,000 per month. Turn on prompt caching with a 90% hit rate, and 90% of that input flows through the cached-input price of $0.50 per million tokens, with the remaining 10% billed at the standard rate. The new bill: $1,800 cached plus $2,000 uncached, landing at $3,800. The headline number that matters is the cached-token rate itself, $5 dropping to $0.50, which is a 10x reduction on the reused prefix.
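
The same arithmetic as a short script, using the article's illustrative prices and hit rate:

```python
PROMPT_TOKENS = 4_000        # system prompt size
CALLS = 1_000_000            # requests per month
STANDARD = 5.00 / 1e6        # $ per standard input token
CACHED = 0.50 / 1e6          # $ per cached input token
HIT_RATE = 0.90

tokens = PROMPT_TOKENS * CALLS                          # 4,000,000,000 tokens
print(f"uncached:     ${tokens * STANDARD:,.0f}")       # $20,000
with_cache = tokens * (HIT_RATE * CACHED + (1 - HIT_RATE) * STANDARD)
print(f"with caching: ${with_cache:,.0f}")              # $3,800
```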

Claude Opus 4.7 shows the same shape: cached input runs at 10% of standard input price, with a small write surcharge on the first call. Gemini 2.5 Pro caches at a similar ratio. The number to memorize across providers is 10%. Any system prompt over a few hundred tokens that repeats more than a handful of times is leaving money on the table without caching enabled.
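
OpenAI and Gemini can apply prefix caching automatically on sufficiently long repeated prompts; Anthropic asks you to mark the cacheable block explicitly. A minimal sketch of the Anthropic marker, with a placeholder model name and prompt:

```python
import anthropic

LONG_STATIC_PROMPT = "..."  # the 4,000-token prefix: role, rules, examples, schema

resp = anthropic.Anthropic().messages.create(
    model="claude-sonnet-4-5",  # placeholder model name
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LONG_STATIC_PROMPT,
            # First call pays the small write surcharge; later calls with a
            # byte-identical prefix bill at the cached rate.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "..."}],
)
```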


Common pitfalls

  • Changing the system prompt mid-conversation breaks the cache. Cache keys are computed over the exact byte sequence of the prefix, so a single trailing space, a swapped synonym, or a reordered rule invalidates the cached prefix and forces a full re-bill at the standard input rate.
  • Placing user-variable content (today's date, the requesting user's name, a per-tenant configuration block) inside the system prompt forces a re-cache on every call. We keep the system prompt static and move anything that varies per request into the user message, where cache misses are expected and cheap; the sketch after this list shows the split.
  • Over-stuffing rules instead of using few-shot examples. A 600-token bullet list of edge-case rules often loses to three concrete input/output examples covering the same edge cases, even when the example block uses fewer tokens. Models pattern-match more reliably from examples than they comply with prose rules, so we trade rule density for example density when the budget is tight.
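
A sketch of that static/dynamic split, assuming an OpenAI-style messages array; the prompt text and helper name are illustrative:

```python
from datetime import date

# The cacheable prefix must stay byte-identical across calls: a trailing
# space or a reworded rule produces a different prefix and a full re-bill
# at the standard input rate.
STATIC_SYSTEM = (
    "You are a tier-one support agent for an analytics SaaS.\n"
    "Confirm the user's account tier before quoting limits.\n"
    "Never promise refunds."
)

def build_messages(user_name: str, user_text: str) -> list[dict]:
    # Anything that varies per request rides the user turn, where a
    # cache miss is expected and cheap.
    context = f"User: {user_name}. Date: {date.today().isoformat()}."
    return [
        {"role": "system", "content": STATIC_SYSTEM},  # cached prefix
        {"role": "user", "content": f"{context}\n\n{user_text}"},
    ]
```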

When to use this tool

We built the system prompt builder for three concrete situations. The first is shipping an API product where the same instructions ride along with every customer call. A consistent role, a stable safety boundary, and a fixed output schema all live in the system slot, get cached once, and amortize across the entire user base. The second is deploying a customer-facing chatbot with a consistent persona and safety rules. Persona drift across sessions is almost always a system-prompt problem (rules buried past 4K tokens, conflicting instructions, or persona language hidden inside a user message), and the builder forces the persona, rules, and format into named slots so drift is easier to spot. The third is standardizing safety and compliance behavior across an organization's multiple AI features. When five product surfaces share a refusal policy, the policy belongs in a single shared system block that every team imports, not five drifting copies.

Frequently asked

What's the difference between a system prompt and a regular prompt?
A system prompt sets persistent behavior across the entire conversation: role, rules, tone, output format. The user message is the turn-by-turn instruction. APIs ship them in different fields (OpenAI's `messages[0].role = system`, Anthropic's top-level `system` parameter). Most LLMs apply heavier weighting to the system prompt and resist overriding it via user messages, which is exactly why safety rules and persona belong in the system slot.
How long can a system prompt be?
The hard limit is the model's context window: 128K to 1M+ tokens on 2026 flagship models, up to 2M on Grok 4.20 and Grok 4.1 Fast. The practical limit is lower: past roughly 4,000 tokens of instructions, compliance stops improving and individual rules start getting lost. Above that, prompt caching becomes mandatory for cost. We push rules into the system prompt aggressively but reach for examples and retrieval before pushing past 4K tokens.
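One way to enforce that budget in CI, using OpenAI's tiktoken tokenizer as a rough proxy for other providers' token counts:

```python
import tiktoken

enc = tiktoken.get_encoding("o200k_base")  # GPT-4o-family encoding

def check_budget(system_prompt: str, budget: int = 4_000) -> int:
    # Fail the build if the system prompt creeps past the practical limit.
    n = len(enc.encode(system_prompt))
    assert n <= budget, f"system prompt is {n} tokens, budget is {budget}"
    return n
```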
Does prompt caching work with system prompts?
Yes, and this is the single highest-leverage optimization. OpenAI, Anthropic, and Google cache system prefixes at roughly 10% of standard input price. A 4,000-token system prompt reused 1M times per month on GPT-5.5 costs $20,000 uncached at $5/1M, but $2,000 cached at $0.50/1M. Claude Opus 4.7 sees the same 10x ratio ($5 vs $0.50). We turn caching on for every API product with a stable system message.
Should I put examples in the system prompt or the user message?
Examples that apply to every conversation belong in the system prompt, where they ride the cache for free after the first call. Examples specific to one user request belong in the user message. Most few-shot use cases (consistent tone, fixed JSON schema, classification labels) are the first kind, so we default to the system slot and only move examples into user turns when they vary per request.
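A sketch of that default layout: conversation-wide few-shot examples ride the cached system slot, request-specific material rides the user turn. The classifier task and labels are illustrative:

```python
SYSTEM = """You are a support-ticket classifier. Reply with exactly one label:
billing, bug, or feature_request.

Input: "I was charged twice this month."
Output: billing

Input: "The export button crashes the dashboard."
Output: bug

Input: "Please add SSO for our whole org."
Output: feature_request
"""

def classify_messages(ticket_text: str) -> list[dict]:
    return [
        {"role": "system", "content": SYSTEM},      # examples ride the cache
        {"role": "user", "content": ticket_text},   # varies per request
    ]
```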
Why is my system prompt being ignored?
Three causes, in order of frequency. First, a forceful, specific user instruction can override a soft or vague system rule in practice, even though models are trained to weight the system slot more heavily. Second, the system prompt is too vague (no concrete examples) or too long (rules buried past 4K tokens). Third, smaller open-source variants do not strongly differentiate roles. We re-state critical rules in the user message as a reminder.
Do all LLMs support system prompts?
Every API-grade model in our list does: OpenAI, Anthropic, Google, xAI, DeepSeek, Mistral, and Meta all expose a system role. Some open-source local models (older Llama variants, raw base models) lack an explicit system field and prepend the system content to the user message instead. Format conventions differ across providers: XML tags on Claude, markdown sections on OpenAI, plain prose on Gemini. The structural intent transfers.
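For models without a system field, the usual fallback is prepending the system text to the first user turn. A minimal sketch, with an illustrative helper name and separator:

```python
def fold_system_into_user(system: str, messages: list[dict]) -> list[dict]:
    # Prepend the system content to the first user message, with a
    # separator so the model can tell instructions from input.
    first, *rest = messages
    merged = {
        "role": "user",
        "content": f"{system}\n\n---\n\n{first['content']}",
    }
    return [merged, *rest]
```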
