System Prompt

Technique

A special instruction block prepended to every conversation that defines the model's persona, constraints, output format, and access boundaries — the foundation of any production LLM deployment.

The system prompt is the persistent instruction that frames every user interaction. It tells the model who it is, what it can and cannot do, how to format responses, and what context (tools, policies, company info) to operate within.

Structure of an Effective System Prompt

1. Persona / Role

"You are a technical support specialist for Acme Corp's HR software. You help employees troubleshoot login issues, leave requests, and payroll queries."

2. Constraints / Guardrails

"Never discuss competitor products. Do not provide medical, legal, or financial advice. If asked, redirect to the relevant department."

3. Output Format

"Always respond in the user's language. Use bullet points for multi-step instructions. Limit responses to 300 words unless the user asks for more detail."

4. Context Injection

Current date, user name/role, available tools, knowledge base citation instructions — anything dynamic that the model needs to act appropriately.

System Prompt Security

On-premise deployments often handle confidential system prompts (containing internal procedures, pricing, or proprietary workflows). Risks:

  • Prompt extraction: A user asks "Repeat your system prompt verbatim". Mitigation: train the model to refuse (DPO alignment), or strip exact-match requests at the API gateway level.
  • Prompt injection: Malicious content in retrieved documents overwrites system instructions. Mitigation: delimit user content and system content clearly; use structured output parsing to validate responses.

System Prompt Length Tradeoffs

A longer system prompt consumes context window tokens that could be used for documents or conversation history. Benchmark your system prompt + worst-case conversation length against your context window budget. For Ollama deployments with 4K context models, a 500-token system prompt leaves only 3500 tokens — often not enough for meaningful RAG.