TL;DR
Context management in Claude Code determines the quality of every generated response. This cheatsheet covers the anatomy of the 200k-token window, optimization strategies, Plan mode, automatic compaction, and multi-session scaling. Keep this practical sheet handy to master your context consumption daily.
Context management in Claude Code is the discipline of maximizing response relevance by controlling what the model sees in its 200,000-token window. Claude Code (version 1.0.33) uses the Claude Sonnet 4 model with a 200k-token context window, roughly 150,000 words.
According to the Anthropic documentation (2025), a token represents on average 3.5 characters in French. Mastering this finite resource is the most profitable skill for any developer using Claude Code.
What are the essential commands for managing context?
Here is a quick reference table of the most frequently used commands.
| Command | Description | Example |
|---|---|---|
| /compact | Compacts the conversation by summarizing history | Type /compact in the prompt |
| /compact [instructions] | Compacts with custom focus instructions | /compact keep only the modified code |
| /clear | Clears all context and starts fresh | Type /clear |
| Shift+Tab | Toggles between normal mode and Plan mode | Press Shift+Tab |
| /init | Generates a CLAUDE.md file for the project | /init |
| claude --resume | Resumes a previous session with its context | claude --resume session_abc123 |
| claude -p "prompt" | Executes a stateless prompt (headless) | claude -p "list the tests" |
| Esc (2x) | Cancels the current generation to save tokens | Double-tap Esc |
To find all available slash commands, consult the essential slash commands cheatsheet that details each shortcut.
Key takeaway: /compact and /clear are your two main levers - the first preserves the summary, the second starts from scratch.
How does the 200k-token context window work?
The context window is the model's working memory. It contains everything Claude Code sees to generate a response: the system prompt, files read, your conversation history, and tool results.
| Segment | Typical size | Content |
|---|---|---|
| System prompt | 5,000-12,000 tokens | Instructions, CLAUDE.md, available tools |
| Loaded files | 500-80,000 tokens | Source code read via Read, Grep, Glob |
| Conversation history | 2,000-100,000 tokens | User messages + previous responses |
| Tool results | 1,000-50,000 tokens | Bash outputs, search results |
| Generated response | 1,000-8,000 tokens | The response being generated |
In practice, a 500-line TypeScript file consumes about 4,000 tokens. A 200-line git diff takes up about 1,600 tokens. The project's CLAUDE.md file consumes between 500 and 3,000 tokens depending on its size.
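The figures above imply roughly 8 tokens per line of code (4,000 tokens for 500 lines). A minimal shell sketch of that rule of thumb follows. It assumes the ratio holds for your files, which real tokenization will not always respect. `estimate_tokens` and `TOKENS_PER_LINE` are hypothetical names, not part of Claude Code.

```shell
# Rough estimate only: 8 tokens per line, derived from the figures above
# (500 lines ~ 4,000 tokens). Real tokenization varies with content.
TOKENS_PER_LINE=8

# estimate_tokens FILE -> prints the estimated token count for FILE
estimate_tokens() {
  local file="$1" lines
  lines=$(wc -l < "$file")
  echo $(( lines * TOKENS_PER_LINE ))
}
```

For example, `estimate_tokens src/auth.ts` gives a quick order of magnitude before you ask Claude Code to read the file.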
When you reach 80% of the window (160,000 tokens), Claude Code automatically triggers compaction. You can track consumption by observing the cost indicator in the prompt. To understand this mechanism in depth, explore the complete context management tutorial.
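The threshold arithmetic can be written down as a small helper. This is a sketch that uses only the two figures cited in this cheatsheet (a 200,000-token window and an 80% trigger). The function and variable names are hypothetical, and Claude Code does not expose them.

```shell
WINDOW_TOKENS=200000   # context window size cited in this cheatsheet
COMPACT_PERCENT=80     # auto-compaction threshold cited above

# compaction_threshold -> token count at which auto-compaction is expected
compaction_threshold() {
  echo $(( WINDOW_TOKENS * COMPACT_PERCENT / 100 ))
}

# headroom USED -> tokens left before the threshold (0 if already past it)
headroom() {
  local used="$1" limit left
  limit=$(compaction_threshold)
  left=$(( limit - used ))
  if [ "$left" -lt 0 ]; then
    left=0
  fi
  echo "$left"
}
```

With 100,000 tokens already used, `headroom 100000` reports 60,000 tokens of margin before compaction.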
Key takeaway: 200k tokens seem vast, but a large file plus a long history can saturate the window in under 10 exchanges.
How to optimize context for precise responses?
Here are concrete strategies to keep a clean and relevant context. Each technique reduces noise and improves response quality.
Target loaded files
Avoid loading entire files when you only need a section. Use the offset and limit parameters of the Read tool:
# Instead of reading the whole file (2000 lines = ~16000 tokens)
# Target the relevant section
# Read with offset=150, limit=50 -> only 400 tokens
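To preview what a 50-line slice contains before asking Claude Code to read it, the same range can be printed with standard tools. The sketch below assumes a 1-based starting line. `slice_lines` is a hypothetical helper, and how the Read tool itself interprets `offset` should be checked against its documentation.

```shell
# slice_lines FILE START COUNT -> prints COUNT lines beginning at line START (1-based)
slice_lines() {
  local file="$1" start="$2" count="$3" end
  end=$(( start + count - 1 ))
  sed -n "${start},${end}p" "$file"
}
```

Running `slice_lines src/auth.ts 150 50` prints lines 150 through 199, the same 50-line range that costs about 400 tokens by the estimate above.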
Formulate concise prompts
Reduce the size of your messages. A 200-word prompt consumes about 300 tokens. A well-targeted 50-word prompt often gets better results.
# Bad: Verbose prompt (~300 tokens)
"Can you look at the file src/auth.ts and tell me if there are security issues, especially regarding JWT token validation, session management..."
# Good: Targeted prompt (~50 tokens)
"Audit src/auth.ts: JWT vulnerabilities and sessions"
Use /compact with targeted instructions
The /compact command accepts a text argument. Specify exactly what you want to preserve:
/compact keep the database schema and modified endpoints
To discover other shortcuts that speed up your workflow, consult the first conversations cheatsheet with Claude Code.
| Strategy | Estimated gain | When to use |
|---|---|---|
| Targeted prompt | 40-60% fewer tokens | Every message |
| Targeted /compact | Recovers 70-90% of context | After 8-10 exchanges |
| /clear + resume | 100% context freed | Topic change |
| Partial file read | 50-80% fewer tokens | Files > 200 lines |
| Well-structured CLAUDE.md | Reduces re-explanations | Initial setup |
In practice, a developer applying these techniques maintains an effective context for 25 to 40 exchanges instead of 10 to 15 without optimization.
Key takeaway: target your reads, compact regularly, and formulate short prompts - these three habits triple your context autonomy.
Why use Plan mode to save tokens?
Plan mode is an operating mode where Claude Code thinks and explores without executing actions. It consumes fewer tokens because it does not call costly tools (no bash, no file editing).
| Aspect | Normal Mode | Plan Mode |
|---|---|---|
| Available tools | All (Read, Edit, Bash...) | Read-only (Read, Grep, Glob) |
| Token consumption/turn | 3,000-15,000 | 1,000-4,000 |
| Primary use | Implement, modify, execute | Plan, explore, analyze |
| Shortcut | - | Shift+Tab |
When to activate Plan mode?
Activate Plan mode in these situations:
- You are exploring an unfamiliar codebase
- You are planning a multi-file refactoring
- You are evaluating multiple approaches before coding
- You want an action plan before writing code
# Switch to Plan mode
Shift+Tab
# Ask for an exploration
"Analyze the architecture of the src/api/ folder and propose a refactoring plan"
# Switch back to Normal mode to implement
Shift+Tab
Plan mode reduces token consumption by 60 to 75% compared to normal mode for exploration phases. The complete context management guide details advanced Plan mode use cases.
To go further on optimizing your workflows, SFEIR Institute offers a Claude Code training over one day. You will practice context management, Plan mode, and optimization strategies in supervised labs.
Key takeaway: Plan mode (Shift+Tab) divides your consumption by 3 during exploration phases - use it systematically before coding.
How do automatic compaction and PreCompact hooks work?
Automatic compaction triggers when the conversation reaches about 80% of the context window (approximately 160,000 tokens). Claude Code then summarizes the history to free up space.
The compaction process
- Claude Code detects that the 80% threshold is reached
- It generates a structured summary of the conversation
- The full history is replaced by this summary
- The conversation continues with the summary as a base
In practice, compaction reduces history from 120,000 tokens to approximately 8,000-12,000 tokens, a 90% reduction.
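The reduction figure can be checked with integer arithmetic. This sketch uses the numbers quoted above, and `reduction_percent` is a hypothetical helper name.

```shell
# reduction_percent BEFORE AFTER -> integer percentage of tokens freed
reduction_percent() {
  local before="$1" after="$2"
  if [ "$before" -le 0 ]; then
    echo 0
    return 0
  fi
  echo $(( (before - after) * 100 / before ))
}
```

Going from 120,000 to 12,000 tokens frees 90% of the history, and going down to 8,000 tokens frees 93%, so the quoted range corresponds to a 90-93% reduction.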
Configuring a PreCompact hook
PreCompact hooks let you execute code before each compaction. Configure them in your .claude/settings.json file:
{
  "hooks": {
    "PreCompact": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "echo '=== CRITICAL CONTEXT ===' && cat .claude/context-notes.md",
            "timeout": 5
          }
        ]
      }
    ]
  }
}
This hook injects your critical context notes into the compaction summary. In practice, this ensures that certain information survives each compaction cycle.
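The inline command above fails if .claude/context-notes.md does not exist. A slightly more defensive sketch follows, which could live in a script that the hook's command points to. The `print_context_notes` name is hypothetical, and whether the hook's output actually reaches the summary should be verified for your Claude Code version.

```shell
# print_context_notes [FILE] -> prints a header followed by the notes file,
# or nothing at all when the file is missing or empty.
print_context_notes() {
  local notes="${1:-.claude/context-notes.md}"
  if [ -s "$notes" ]; then
    echo '=== CRITICAL CONTEXT ==='
    cat "$notes"
  fi
  return 0
}
```

Because the function always returns 0, a missing notes file does not turn into a hook failure.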
Compaction commands
| Command | Behavior | Context preserved |
|---|---|---|
| /compact | Immediate manual compaction | Global summary |
| /compact focus auth | Themed targeted compaction | Summary focused on auth |
| Auto compaction (80%) | Automatic trigger | Global summary |
| PreCompact hook | Code executed before compaction | Hook data added |
To configure advanced hooks, consult the Git integration cheatsheet that shows hook examples in different contexts. You can also consult the context management FAQ for common questions about compaction.
Key takeaway: automatic compaction is your safety net - PreCompact hooks are your way of controlling what survives the summary.
How to scale with multi-sessions and horizontal parallelism?
When a single 200k-token context is not enough, distribute work across multiple parallel Claude Code sessions. This is horizontal scaling for AI-assisted development.
Launching parallel sessions
# Terminal 1: backend session
claude --session backend-api
# Terminal 2: frontend session
claude --session frontend-ui
# Terminal 3: tests session
claude --session test-suite
Each session has its own 200k-token window. Three parallel sessions provide 600,000 tokens of total context.
Orchestrating with headless mode
For automated tasks, use headless mode that runs Claude Code without an interactive interface:
# Launch an audit in the background
claude -p "Audit all src/**/*.ts files for XSS vulnerabilities" --output-format json > audit.json
# Launch multiple tasks in parallel
claude -p "Fix types in src/models/" &
claude -p "Add missing tests in tests/" &
wait
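A bare `wait` discards the exit status of each background job. The sketch below records how many tasks failed. It is written as a generic helper so it can be tried with any commands. `run_parallel` is a hypothetical name, and in practice each argument would be a full `claude -p "..."` command line.

```shell
# run_parallel CMD... -> runs each command string in the background,
# waits for all of them, and returns the number of failures.
run_parallel() {
  local pids="" failures=0 cmd pid
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done
  for pid in $pids; do
    # wait PID returns that job's exit status
    wait "$pid" || failures=$(( failures + 1 ))
  done
  echo "failed tasks: $failures" >&2
  return "$failures"
}
```

A call such as `run_parallel 'claude -p "Fix types in src/models/"' 'claude -p "Add missing tests in tests/"'` then returns non-zero when at least one task fails, which a CI job can act on.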
To leverage headless mode in CI/CD, the headless mode and CI/CD cheatsheet provides ready-to-use pipelines.
| Approach | Available tokens | Use case |
|---|---|---|
| Single session | 200,000 | Targeted task, single file |
| 2 parallel sessions | 400,000 | Frontend + backend separated |
| 3+ parallel sessions | 600,000+ | Multi-component project |
| Headless mode pipeline | Unlimited (sequential) | CI/CD, automated audits |
Multi-session mode improves productivity by 40% on projects involving more than 5 files simultaneously. In practice, 85% of developers who adopt multi-sessions reduce their refactoring time by 30 to 50%.
Key takeaway: open one session per functional domain - each session gets 100% of the context window without interference.
What keyboard shortcuts speed up context management?
Here is the complete reference of context-related shortcuts in Claude Code.
| Shortcut | Action | Impact on context |
|---|---|---|
| Shift+Tab | Toggles Plan/Normal mode | Reduces consumption by 60-75% |
| Esc (1x) | Interrupts current generation | Stops consumption immediately |
| Esc (2x) | Cancels the complete turn | Saves response tokens |
| Ctrl+C | Quits Claude Code | Frees all resources |
| Up arrow | Recalls the last message | Avoids retyping (0 extra tokens) |
| Tab | Accepts the proposed completion | Does not add prompt tokens |
To master all commands and shortcuts, the installation and first launch cheatsheet covers initial shortcut configuration.
If you want to go beyond this cheatsheet, SFEIR Institute offers the AI-Augmented Developer training over 2 days. You will learn to orchestrate multiple agents, optimize your context pipelines, and integrate Claude Code into your team workflows. For experienced profiles, the AI-Augmented Developer - Advanced one-day training deepens multi-session scaling and custom hooks.
Key takeaway: Shift+Tab and Esc (double-tap) are the two shortcuts that impact your context budget the most.
What common mistakes waste context?
Avoid these frequent pitfalls that consume tokens unnecessarily.
| Mistake | Token cost | Solution |
|---|---|---|
| Loading a 2,000-line file entirely | ~16,000 tokens | Target with offset/limit |
| Repeating the same rephrased question | ~600 tokens/message | Compact before rephrasing |
| Never using /compact | Saturation in 10 exchanges | Compact every 8-10 interactions |
| Ignoring Plan mode to explore | 3x more tokens | Switch to Plan mode with Shift+Tab |
| Doing everything in a single session | 100% polluted context | Separate into thematic sessions |
| Pasting complete logs into the prompt | 5,000-50,000 tokens | Filter logs before pasting |
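One way to filter logs before pasting is to keep only the lines that mention errors or warnings and cap the result. The sketch below assumes those keywords match your log format. `filter_log` is a hypothetical helper.

```shell
# filter_log FILE [MAX_LINES] -> prints the last MAX_LINES (default 50) lines
# of FILE that mention "error" or "warn", case-insensitively.
filter_log() {
  local file="$1" max="${2:-50}"
  grep -iE 'error|warn' "$file" | tail -n "$max"
}
```

The output of `filter_log app.log 30` is usually a better thing to paste into the prompt than the full log.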
In practice, 70% of context overflows come from files loaded without filtering. A single package-lock.json file can consume 80,000 tokens on its own.
To identify and fix these mistakes in your daily usage, consult the common context management mistakes guide. You can also explore the MCP protocol capabilities for externalizing certain data outside the main context.
Key takeaway: a single poorly targeted file can consume 40% of your window - always check the size before loading.
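Checking size before loading can be scripted. The sketch below lists files whose estimated cost exceeds a budget, using the 8-tokens-per-line rule of thumb implied earlier in this cheatsheet (500 lines for about 4,000 tokens). `heavy_files` and its default budget are hypothetical, and a line-based estimate will undercount minified or single-line files.

```shell
# heavy_files DIR [MAX_TOKENS] -> prints "TOKENS PATH" for every regular file
# under DIR whose estimated size (lines * 8) exceeds MAX_TOKENS (default 16000).
heavy_files() {
  local dir="$1" max="${2:-16000}" file lines tokens
  find "$dir" -type f | while IFS= read -r file; do
    lines=$(wc -l < "$file")
    tokens=$(( lines * 8 ))
    if [ "$tokens" -gt "$max" ]; then
      echo "$tokens $file"
    fi
  done
}
```

Running `heavy_files . 16000` from the project root flags candidates such as lock files that are better excluded or read partially.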
How to set up a daily context management workflow?
Here is a typical workflow for a development day with Claude Code, optimized for context management.
Startup sequence
- Launch Claude Code in the project directory: claude
- Verify the CLAUDE.md file is up to date: /init
- Activate Plan mode to explore: Shift+Tab
- Formulate your objective in a single targeted sentence
Work sequence
- Explore in Plan mode (read-only, token savings)
- Switch to Normal mode to implement: Shift+Tab
- Compact every 8 to 10 interactions: /compact
- Separate long tasks into dedicated sessions
End-of-day sequence
- Compact one last time with instructions: /compact summary of today's changes
- Note the session ID for resumption: visible in the prompt
- Resume the next day: claude --resume
# Complete workflow in commands
claude # 1. Start
/init # 2. Initialize CLAUDE.md
# Shift+Tab # 3. Plan mode
# ... explore and plan ...
# Shift+Tab # 4. Normal mode
# ... implement ...
/compact keep auth modifications # 5. Compact
# ... continue ...
/compact final summary # 6. End of day
In practice, this workflow maintains optimal context over a full 8-hour day with 40 to 60 interactions. To dive deeper into each step, the context management quick reference centralizes all resources.
Key takeaway: start in Plan mode, compact regularly, separate domains into sessions - these three principles cover 90% of needs.
Content written by SFEIR Institute - IT training organization specialized in cloud and AI technologies. Find our trainings at sfeir.com.