Context Management - Cheatsheet

Context management in Claude Code determines the quality of every generated response. This cheatsheet covers the anatomy of the 200k-token window, optimization strategies, Plan mode, automatic compaction, and multi-session scaling. Keep this practical sheet handy to master your context consumption daily.

Context management in Claude Code is the discipline of maximizing response relevance by controlling what the model sees in its context window. Claude Code (version 1.0.33) uses the Claude Sonnet 4 model with a 200,000-token context window, roughly 150,000 words.

According to the Anthropic documentation (2025), a token represents on average 3.5 characters in French. Mastering this finite resource is the most profitable skill for any developer using Claude Code.

What are the essential commands for managing context?

Here is a quick reference table of the most frequently used commands.

Command	Description	Example
`/compact`	Compacts the conversation by summarizing history	`Type /compact in the prompt`
`/compact [instructions]`	Compacts with custom focus instructions	`/compact keep only the modified code`
`/clear`	Clears all context and starts fresh	`Type /clear`
`Shift+Tab`	Toggles between normal mode and Plan mode	`Press Shift+Tab`
`/init`	Generates a CLAUDE.md file for the project	`/init`
`claude --resume`	Resumes a previous session with its context	`claude --resume session_abc123`
`claude -p "prompt"`	Executes a stateless prompt (headless)	`claude -p "list the tests"`
`Esc` (2x)	Cancels the complete turn to save tokens	`Double-tap Esc`

To find all available slash commands, consult the essential slash commands cheatsheet that details each shortcut.

Key takeaway: /compact and /clear are your two main levers - the first preserves the summary, the second starts from scratch.

How does the 200k-token context window work?

The context window is the model's working memory. It contains everything Claude Code sees to generate a response: the system prompt, files read, your conversation history, and tool results.

Segment	Typical size	Content
System prompt	5,000-12,000 tokens	Instructions, CLAUDE.md, available tools
Loaded files	500-80,000 tokens	Source code read via `Read`, `Grep`, `Glob`
Conversation history	2,000-100,000 tokens	User messages + previous responses
Tool results	1,000-50,000 tokens	Bash outputs, search results
Generated response	1,000-8,000 tokens	The response being generated

In practice, a 500-line TypeScript file consumes about 4,000 tokens. A 200-line git diff takes up about 1,600 tokens. The project's CLAUDE.md file consumes between 500 and 3,000 tokens depending on its size.
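
The figures above imply a rule of thumb of roughly 8 tokens per line of code. A minimal sketch of an estimator built on that ratio follows; the function names are hypothetical, and the ratios are the cheatsheet's averages, not the output of a real tokenizer.

```python
import math

# Ratios taken from the figures quoted above; real tokenizers vary by content.
TOKENS_PER_LINE = 8      # 4,000 tokens / 500 lines of TypeScript
CHARS_PER_TOKEN = 3.5    # average quoted earlier for French text


def estimate_tokens_from_lines(line_count: int) -> int:
    """Rough token estimate for a source file, given its line count."""
    return line_count * TOKENS_PER_LINE


def estimate_tokens_from_text(text: str) -> int:
    """Rough token estimate for prose, given its character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


print(estimate_tokens_from_lines(500))  # 4000 (the TypeScript file)
print(estimate_tokens_from_lines(200))  # 1600 (the git diff)
```

Treat the result as an order of magnitude: dense code or long identifiers can cost noticeably more than 8 tokens per line.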

When you reach 80% of the window (160,000 tokens), Claude Code automatically triggers compaction. You can track consumption by observing the cost indicator in the prompt. To understand this mechanism in depth, explore the complete context management tutorial.

Key takeaway: 200k tokens seem vast, but a large file plus a long history can saturate the window in under 10 exchanges.
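
The saturation claim can be checked with simple arithmetic. The sketch below counts how many exchanges fit before the 80% threshold is crossed; the starting load (a 10,000-token system prompt plus one 80,000-token file) and the 8,000 tokens added per exchange are illustrative assumptions, not measurements.

```python
def exchanges_until_compaction(
    base_tokens: int,
    tokens_per_exchange: int,
    window: int = 200_000,
    threshold_percent: int = 80,
) -> int:
    """Count exchanges needed to reach the auto-compaction threshold."""
    if tokens_per_exchange <= 0:
        raise ValueError("tokens_per_exchange must be positive")
    limit = window * threshold_percent // 100  # 160,000 with the defaults
    used = base_tokens
    exchanges = 0
    while used < limit:
        used += tokens_per_exchange
        exchanges += 1
    return exchanges


# Assumed load: 10,000 (system prompt) + 80,000 (one large file) = 90,000 tokens.
print(exchanges_until_compaction(90_000, 8_000))  # 9
```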

How to optimize context for precise responses?

Here are concrete strategies to keep a clean and relevant context. Each technique reduces noise and improves response quality.

Target loaded files

Avoid loading entire files when you only need a section. Use the offset and limit parameters of the Read tool:

# Instead of reading the whole file (2000 lines = ~16000 tokens)
# Target the relevant section
# Read with offset=150, limit=50 -> only 400 tokens
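
To illustrate the idea of a partial read, here is a minimal Python sketch that returns only a window of lines. It is not the Read tool's implementation: `slice_lines` and `read_slice` are hypothetical helpers, and the 0-based `offset` is an assumption of this sketch.

```python
from itertools import islice
from typing import Iterable, List


def slice_lines(lines: Iterable[str], offset: int, limit: int) -> List[str]:
    """Return at most `limit` lines, skipping the first `offset` lines."""
    return list(islice(lines, offset, offset + limit))


def read_slice(path: str, offset: int, limit: int) -> List[str]:
    """Read only a window of a text file instead of the whole file."""
    with open(path, encoding="utf-8") as handle:
        return slice_lines(handle, offset, limit)


# A 2,000-line file reduced to a 50-line section.
sample = [f"line {i}" for i in range(2000)]
section = slice_lines(sample, offset=150, limit=50)
print(len(section))  # 50
```

At about 8 tokens per line, those 50 lines cost roughly 400 tokens instead of 16,000 for the full file.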

Formulate concise prompts

Reduce the size of your messages. A 200-word prompt consumes about 300 tokens. A well-targeted 50-word prompt often gets better results.

# Bad: Verbose prompt (~300 tokens)
"Can you look at the file src/auth.ts and tell me if there are security issues, especially regarding JWT token validation, session management..."

# Good: Targeted prompt (~50 tokens)
"Audit src/auth.ts: JWT vulnerabilities and sessions"

Use /compact with targeted instructions

The /compact command accepts a text argument. Specify exactly what you want to preserve:

/compact keep the database schema and modified endpoints

To discover other shortcuts that speed up your workflow, consult the first conversations cheatsheet with Claude Code.

Strategy	Estimated gain	When to use
Targeted prompt	40-60% fewer tokens	Every message
Targeted `/compact`	Recovers 70-90% of context	After 8-10 exchanges
`/clear` + resume	100% context freed	Topic change
Partial file read	50-80% fewer tokens	Files > 200 lines
Well-structured CLAUDE.md	Reduces re-explanations	Initial setup

In practice, a developer applying these techniques maintains an effective context for 25 to 40 exchanges instead of 10 to 15 without optimization.

Key takeaway: target your reads, compact regularly, and formulate short prompts - these three habits triple your context autonomy.

Why use Plan mode to save tokens?

Plan mode is an operating mode where Claude Code thinks and explores without executing actions. It consumes fewer tokens because it does not call costly tools (no bash, no file editing).

Aspect	Normal Mode	Plan Mode
Available tools	All (Read, Edit, Bash...)	Read-only (Read, Grep, Glob)
Token consumption/turn	3,000-15,000	1,000-4,000
Primary use	Implement, modify, execute	Plan, explore, analyze
Shortcut	-	`Shift+Tab`

When to activate Plan mode?

Activate Plan mode in these situations:

You are exploring an unfamiliar codebase
You are planning a multi-file refactoring
You are evaluating multiple approaches before coding
You want an action plan before writing code

# Switch to Plan mode
Shift+Tab

# Ask for an exploration
"Analyze the architecture of the src/api/ folder and propose a refactoring plan"

# Switch back to Normal mode to implement
Shift+Tab

Plan mode reduces token consumption by 60 to 75% compared to normal mode for exploration phases. The complete context management guide details advanced Plan mode use cases.
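
The 60 to 75% figure can be cross-checked against the per-turn ranges in the comparison table. A small sketch, assuming the low and high ends of each range correspond to each other:

```python
def reduction(normal_tokens: int, plan_tokens: int) -> float:
    """Fraction of tokens saved per turn by using Plan mode."""
    return 1 - plan_tokens / normal_tokens


low_end = reduction(3_000, 1_000)    # low end of both ranges
high_end = reduction(15_000, 4_000)  # high end of both ranges

print(f"{low_end:.0%}")   # 67%
print(f"{high_end:.0%}")  # 73%
```

Both values fall inside the quoted range, and a 67% saving is the same as dividing consumption by 3.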

To go further on optimizing your workflows, SFEIR Institute offers a Claude Code training over one day. You will practice context management, Plan mode, and optimization strategies in supervised labs.

Key takeaway: Plan mode (Shift+Tab) divides your consumption by 3 during exploration phases - use it systematically before coding.

How do automatic compaction and PreCompact hooks work?

Automatic compaction triggers when the conversation reaches about 80% of the context window (approximately 160,000 tokens). Claude Code then summarizes the history to free up space.

The compaction process

Claude Code detects that the 80% threshold is reached
It generates a structured summary of the conversation
The full history is replaced by this summary
The conversation continues with the summary as a base

In practice, compaction reduces history from 120,000 tokens to approximately 8,000-12,000 tokens, a reduction of 90 to 93%.
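
The four steps above can be sketched conceptually as follows. This is not Claude Code's implementation: the `Message` type, the `maybe_compact` function, and the injected `summarize` callback are hypothetical, and the sketch counts only conversation tokens against the threshold.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    text: str
    tokens: int


def maybe_compact(
    history: List[Message],
    summarize: Callable[[List[Message]], Message],
    window: int = 200_000,
    threshold_percent: int = 80,
) -> List[Message]:
    """Replace the whole history with one summary once the threshold is hit."""
    used = sum(message.tokens for message in history)
    if used < window * threshold_percent // 100:
        return history             # below the threshold: nothing changes
    return [summarize(history)]    # the summary becomes the new base


def fake_summarize(history: List[Message]) -> Message:
    """Stand-in for the model-generated summary (fixed size for the demo)."""
    return Message(text=f"summary of {len(history)} messages", tokens=10_000)


history = [Message("exchange", 40_000) for _ in range(4)]  # 160,000 tokens
compacted = maybe_compact(history, fake_summarize)
print(len(compacted), compacted[0].tokens)  # 1 10000
```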

Configuring a PreCompact hook

PreCompact hooks let you execute code before each compaction. Configure them in your .claude/settings.json file:

{
  "hooks": {
    "PreCompact": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "echo '=== CRITICAL CONTEXT ===' && cat .claude/context-notes.md",
            "timeout": 5
          }
        ]
      }
    ]
  }
}

This hook injects your critical context notes into the compaction summary. In practice, this ensures that certain information survives each compaction cycle.
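
If the one-line shell command becomes limiting, the same idea can be moved into a small script. The sketch below assumes the hook receives a JSON payload on standard input with a `trigger` field; check that assumption against the hooks reference for your version. The `render_notes` and `main` names are hypothetical.

```python
import json
import sys
from pathlib import Path

NOTES_PATH = Path(".claude/context-notes.md")  # same notes file as the hook above


def render_notes(payload: dict, notes: str) -> str:
    """Format the notes under a header that records why compaction ran."""
    trigger = payload.get("trigger", "unknown")  # assumed field name
    header = f"=== CRITICAL CONTEXT (compaction trigger: {trigger}) ==="
    return f"{header}\n{notes.strip()}\n"


def main() -> int:
    """Read the hook payload from stdin and print the notes to stdout."""
    raw = sys.stdin.read()
    payload = json.loads(raw) if raw.strip() else {}
    notes = NOTES_PATH.read_text(encoding="utf-8") if NOTES_PATH.exists() else ""
    sys.stdout.write(render_notes(payload, notes))
    return 0

# When saved as a standalone hook script, end the file with: sys.exit(main())
```

The hook's `command` would then point at this script instead of the `echo ... && cat ...` pipeline.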

Compaction commands

Command	Behavior	Context preserved
`/compact`	Immediate manual compaction	Global summary
`/compact focus auth`	Themed targeted compaction	Summary focused on auth
Auto compaction (80%)	Automatic trigger	Global summary
PreCompact hook	Code executed before compaction	Hook data added

To configure advanced hooks, consult the Git integration cheatsheet that shows hook examples in different contexts. You can also consult the context management FAQ for common questions about compaction.

Key takeaway: automatic compaction is your safety net - PreCompact hooks are your way of controlling what survives the summary.

How to scale with multi-sessions and horizontal parallelism?

When a single 200k-token context is not enough, distribute work across multiple parallel Claude Code sessions. This is horizontal scaling for AI-assisted development.

Launching parallel sessions

# Each claude process started in its own terminal is an independent session.
# State the scope (backend, frontend, tests) in the first prompt of each one.

# Terminal 1: backend session
claude

# Terminal 2: frontend session
claude

# Terminal 3: tests session
claude

Each session has its own 200k-token window. Three parallel sessions provide 600,000 tokens of total context.

Orchestrating with headless mode

For automated tasks, use headless mode, which runs Claude Code without an interactive interface:

# Launch an audit in the background
claude -p "Audit all src/**/*.ts files for XSS vulnerabilities" --output-format json > audit.json

# Launch multiple tasks in parallel
claude -p "Fix types in src/models/" &
claude -p "Add missing tests in tests/" &
wait
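
The same fan-out can be driven from a script when you want to collect each task's output. This is a minimal sketch using only the `-p` and `--output-format json` flags shown above; `build_command`, `run_command`, and `run_parallel` are hypothetical helper names, and the `runner` parameter exists so the logic can be exercised without calling the CLI.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def build_command(prompt: str, json_output: bool = False) -> List[str]:
    """Build the argument list for one headless invocation."""
    command = ["claude", "-p", prompt]
    if json_output:
        command += ["--output-format", "json"]
    return command


def run_command(command: Sequence[str]) -> str:
    """Run one command and return its standard output."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout


def run_parallel(
    prompts: Sequence[str],
    runner: Callable[[Sequence[str]], str] = run_command,
) -> List[str]:
    """Run one headless task per prompt; results keep the prompt order."""
    commands = [build_command(prompt) for prompt in prompts]
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(runner, commands))


print(build_command("list the tests", json_output=True))
# ['claude', '-p', 'list the tests', '--output-format', 'json']
```

Calling `run_parallel(["Fix types in src/models/", "Add missing tests in tests/"])` would mirror the two background jobs above, with each task still running in its own context window.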

To leverage headless mode in CI/CD, the headless mode and CI/CD cheatsheet provides ready-to-use pipelines.

Approach	Available tokens	Use case
Single session	200,000	Targeted task, single file
2 parallel sessions	400,000	Frontend + backend separated
3+ parallel sessions	600,000+	Multi-component project
Headless mode pipeline	Unlimited (sequential)	CI/CD, automated audits

Multi-session mode improves productivity by 40% on projects involving more than 5 files simultaneously. In practice, 85% of developers who adopt multi-sessions reduce their refactoring time by 30 to 50%.

Key takeaway: open one session per functional domain - each session gets 100% of the context window without interference.

What keyboard shortcuts speed up context management?

Here is the complete reference of context-related shortcuts in Claude Code.

Shortcut	Action	Impact on context
`Shift+Tab`	Toggles Plan/Normal mode	Reduces consumption by 60-75%
`Esc` (1x)	Interrupts current generation	Stops consumption immediately
`Esc` (2x)	Cancels the complete turn	Saves response tokens
`Ctrl+C`	Quits Claude Code	Frees all resources
`Up arrow`	Recalls the last message	Avoids retyping (0 extra tokens)
`Tab`	Accepts the proposed completion	Does not add prompt tokens

To master all commands and shortcuts, the installation and first launch cheatsheet covers initial shortcut configuration.

If you want to go beyond this cheatsheet, SFEIR Institute offers the AI-Augmented Developer training over 2 days. You will learn to orchestrate multiple agents, optimize your context pipelines, and integrate Claude Code into your team workflows. For experienced profiles, the AI-Augmented Developer - Advanced one-day training deepens multi-session scaling and custom hooks.

Key takeaway: Shift+Tab and Esc (double-tap) are the two shortcuts that impact your context budget the most.

What common mistakes waste context?

Avoid these frequent pitfalls that consume tokens unnecessarily.

Mistake	Token cost	Solution
Loading a 2,000-line file entirely	~16,000 tokens	Target with offset/limit
Repeating the same rephrased question	~600 tokens/message	Compact before rephrasing
Never using `/compact`	Saturation in 10 exchanges	Compact every 8-10 interactions
Ignoring Plan mode to explore	3x more tokens	Switch to Plan mode with `Shift+Tab`
Doing everything in a single session	100% polluted context	Separate into thematic sessions
Pasting complete logs into the prompt	5,000-50,000 tokens	Filter logs before pasting
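
For the last row, filtering can be as simple as keeping the most recent lines that mention a problem. A minimal sketch follows; the `ERROR` and `WARN` keywords and the 50-line cap are assumptions to adapt to your log format.

```python
from typing import Iterable, List, Sequence


def filter_log(
    lines: Iterable[str],
    keywords: Sequence[str] = ("ERROR", "WARN"),
    max_lines: int = 50,
) -> List[str]:
    """Keep only the last `max_lines` lines containing one of the keywords."""
    if max_lines <= 0:
        return []
    matching = [line for line in lines if any(key in line for key in keywords)]
    return matching[-max_lines:]


log = [
    "INFO start",
    "ERROR db down",
    "DEBUG retry scheduled",
    "WARN slow query",
    "ERROR timeout",
]
print(filter_log(log, max_lines=2))  # ['WARN slow query', 'ERROR timeout']
```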

In practice, 70% of context overflows come from files loaded without filtering. A single package-lock.json file can consume 80,000 tokens on its own.
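
A quick pre-load check makes the cost of such a file visible. The sketch below expresses a token estimate as a share of the window; the 10% warning threshold is an arbitrary assumption, and both function names are hypothetical.

```python
def window_share(tokens: int, window: int = 200_000) -> float:
    """Fraction of the context window a given token count would occupy."""
    return tokens / window


def should_filter(tokens: int, window: int = 200_000, max_share_percent: int = 10) -> bool:
    """True when a file would take more than the allowed share of the window."""
    return tokens * 100 > window * max_share_percent


print(f"{window_share(80_000):.0%}")  # 40% (the package-lock.json example)
print(should_filter(80_000))          # True
print(should_filter(4_000))           # False (a 500-line source file)
```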

To identify and fix these mistakes in your daily usage, consult the common context management mistakes guide. You can also explore the MCP protocol capabilities for externalizing certain data outside the main context.

Key takeaway: a single poorly targeted file can consume 40% of your window - always check the size before loading.

How to set up a daily context management workflow?

Here is a typical workflow for a development day with Claude Code, optimized for context management.

Startup sequence

Launch Claude Code in the project directory: claude
Create or refresh the CLAUDE.md file: /init
Activate Plan mode to explore: Shift+Tab
Formulate your objective in a single targeted sentence

Work sequence

Explore in Plan mode (read-only, token savings)
Switch to Normal mode to implement: Shift+Tab
Compact every 8 to 10 interactions: /compact
Separate long tasks into dedicated sessions

End-of-day sequence

Compact one last time with instructions: /compact summary of today's changes
Note the session ID for resumption: visible in the prompt
Resume the next day: claude --resume

# Complete workflow in commands
claude                          # 1. Start
/init                           # 2. Initialize CLAUDE.md
# Shift+Tab                    # 3. Plan mode
# ... explore and plan ...
# Shift+Tab                    # 4. Normal mode
# ... implement ...
/compact keep auth modifications  # 5. Compact
# ... continue ...
/compact final summary           # 6. End of day

In practice, this workflow maintains optimal context over a full 8-hour day with 40 to 60 interactions. To dive deeper into each step, the context management quick reference centralizes all resources.

Key takeaway: start in Plan mode, compact regularly, separate domains into sessions - these three principles cover 90% of needs.

Content written by SFEIR Institute - IT training organization specialized in cloud and AI technologies. Find our trainings at sfeir.com.