Chain-of-Thought (CoT) prompting asks the model to show its reasoning process before giving a final answer. Wei et al. (2022) showed that it dramatically improves accuracy on math, logic, and multi-step reasoning tasks, at no training cost: the gain comes purely from prompt design.
Variants
Zero-Shot CoT
Simply append "Let's think step by step." to the question. The model then produces a reasoning trace without any worked examples. Works on most instruction-tuned models of ≥7B parameters.
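A minimal sketch, assuming the model sits behind an OpenAI-compatible endpoint (as servers like vLLM, llama.cpp, and Ollama typically expose); the base URL and model name are placeholders for your own deployment.

```python
# Zero-shot CoT: the only change versus a plain prompt is the appended trigger phrase.
# Endpoint URL and model name are placeholders (assumptions about your deployment).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

question = (
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)

response = client.chat.completions.create(
    model="llama-3-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": f"{question}\nLet's think step by step."}],
)

# The reply contains the reasoning trace followed by the final answer.
print(response.choices[0].message.content)
```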
Few-Shot CoT
Provide 2–5 worked examples with reasoning traces in the prompt. More effective than zero-shot for domain-specific tasks but consumes more context.
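A sketch of a few-shot CoT prompt. The worked examples are the classic arithmetic demonstrations from the CoT literature; `complete(prompt)` is a placeholder for whatever client your model is served through.

```python
# Few-shot CoT: worked examples with reasoning traces precede the real question.
# `complete(prompt)` is assumed to call your local model and return a string.
FEW_SHOT_COT = """\
Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. They used 20 to make lunch and bought 6 more. How many apples do they have?
A: They started with 23, used 20, leaving 3. They bought 6 more, so 3 + 6 = 9. The answer is 9.

"""

def few_shot_cot_prompt(question: str) -> str:
    # The model imitates the demonstrated traces and ends with "The answer is ...".
    return f"{FEW_SHOT_COT}Q: {question}\nA:"

# answer = complete(few_shot_cot_prompt("A pack of 12 pencils costs $3. What do 30 pencils cost?"))
```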
Tree of Thoughts (ToT)
Explore multiple reasoning branches in parallel, scoring partial solutions and pruning weak ones. Implemented as multi-call orchestration rather than a single prompt. Strong on puzzle-type tasks.
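A breadth-first sketch of that orchestration loop, assuming you supply two model-backed callables: `propose`, which suggests candidate next thoughts for a partial solution, and `score`, which rates one. The names and limits are illustrative, not taken from any particular framework.

```python
# Tree-of-Thoughts sketch: expand each surviving branch with candidate next
# steps, score all candidates, keep the best `width`, and repeat to `depth`.
from typing import Callable

def tree_of_thoughts(
    question: str,
    propose: Callable[[str], list[str]],  # model call: candidate next thoughts
    score: Callable[[str], float],        # model call: rate a partial solution
    width: int = 3,                       # branches kept per level
    depth: int = 4,                       # reasoning steps
) -> str:
    frontier = [question]
    for _ in range(depth):
        candidates = [
            f"{state}\n{thought}" for state in frontier for thought in propose(state)
        ]
        if not candidates:  # no further expansions proposed
            break
        # Prune: keep only the highest-scoring partial solutions.
        frontier = sorted(candidates, key=score, reverse=True)[:width]
    return frontier[0]  # best chain found within the budget
```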
ReAct (Reason + Act)
Interleaves reasoning steps with tool calls. The basis of most modern agent frameworks. Requires a model with reliable function-calling ability.
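A text-parsing sketch of the ReAct loop, in the spirit of the original paper's formulation (which predates native function calling). `complete(prompt)` and the `tools` dict are placeholders for your model client and tool implementations.

```python
# ReAct loop sketch: the model emits Thought / Action lines; we execute the
# action, append the Observation, and loop until it answers directly.
import re

def react(question: str, complete, tools: dict, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = complete(transcript + "Thought:")
        transcript += f"Thought:{step}\n"
        action = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if action is None:
            return step  # no tool call: treat the thought as the final answer
        name, arg = action.groups()
        observation = tools[name](arg)  # run the requested tool
        transcript += f"Observation: {observation}\n"
    return transcript  # step budget exhausted; return the full trace
```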
When CoT Helps (and When It Doesn't)
CoT helps most on tasks that benefit from intermediate steps: arithmetic, symbolic reasoning, multi-constraint planning, code debugging. It has little effect on simple factual retrieval or sentiment classification — and can actually hurt if the reasoning chain introduces errors that override a correct "gut" answer. Turn it off for high-volume, simple classification tasks to save tokens and latency.
Why It Matters for On-Premise
On-premise models tend to be smaller than frontier cloud models. CoT is one of the cheapest ways to close the capability gap: it requires no fine-tuning, only smarter prompting. A Llama 3 8B with a well-crafted CoT prompt can outperform a naive call to a 70B model on some structured reasoning tasks.