Context Management - Cheatsheet

SFEIR Institute

TL;DR

Context management in Claude Code determines the quality of every generated response. This cheatsheet covers the anatomy of the 200k-token window, optimization strategies, Plan mode, automatic compaction, and multi-session scaling. Keep this practical sheet handy to master your context consumption daily.

Context management in Claude Code is the discipline of maximizing response relevance by controlling what the model sees in its 200,000-token window. Claude Code (version 1.0.33) uses the Claude Sonnet 4 model with a 200k-token context window, roughly 150,000 words.

According to the Anthropic documentation (2025), a token represents on average 3.5 characters in French. Because the window is a finite resource, managing it well is one of the most valuable skills for a developer using Claude Code.
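
As a rough illustration of these ratios, the sketch below converts characters and words into approximate token counts. The function names are hypothetical, the 3.5 characters-per-token figure is the French average quoted above, and the words-per-token ratio is simply derived from the 200,000-token to 150,000-word equivalence. Real tokenizers vary with language and content, so treat the results as estimates only.

```python
import math

CHARS_PER_TOKEN = 3.5                 # average quoted above for French text
WORDS_PER_TOKEN = 150_000 / 200_000   # 0.75, derived from the window-size equivalence


def tokens_from_chars(text: str) -> int:
    """Approximate token count from the character length, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def tokens_from_words(word_count: int) -> int:
    """Approximate token count from a word count, rounded up."""
    return math.ceil(word_count / WORDS_PER_TOKEN)


print(tokens_from_chars("a" * 700))   # 700 characters -> 200
print(tokens_from_words(150_000))     # 150,000 words  -> 200000
```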

What are the essential commands for managing context?

Here is the quick reference table of the most frequently used commands. Each row is self-contained and citable.

| Command | Description | Example |
|---|---|---|
| /compact | Compacts the conversation by summarizing history | Type /compact in the prompt |
| /compact [instructions] | Compacts with custom focus instructions | /compact keep only the modified code |
| /clear | Clears all context and starts fresh | Type /clear |
| Shift+Tab | Toggles between normal mode and Plan mode | Press Shift+Tab |
| /init | Generates a CLAUDE.md file for the project | /init |
| claude --resume | Resumes a previous session with its context | claude --resume session_abc123 |
| claude -p "prompt" | Executes a stateless prompt (headless) | claude -p "list the tests" |
| Esc (2x) | Cancels the current generation to save tokens | Double-tap Esc |

To find all available slash commands, consult the essential slash commands cheatsheet that details each shortcut.

Key takeaway: /compact and /clear are your two main levers - the first keeps a summary of the conversation, the second starts from scratch.

How does the 200k-token context window work?

The context window is the model's working memory. It contains everything Claude Code sees to generate a response: the system prompt, files read, your conversation history, and tool results.

| Segment | Typical size | Content |
|---|---|---|
| System prompt | 5,000-12,000 tokens | Instructions, CLAUDE.md, available tools |
| Loaded files | 500-80,000 tokens | Source code read via Read, Grep, Glob |
| Conversation history | 2,000-100,000 tokens | User messages + previous responses |
| Tool results | 1,000-50,000 tokens | Bash outputs, search results |
| Generated response | 1,000-8,000 tokens | The response being generated |

In practice, a 500-line TypeScript file consumes about 4,000 tokens. A 200-line git diff takes up about 1,600 tokens. The project's CLAUDE.md file consumes between 500 and 3,000 tokens depending on its size.
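
The two figures above both work out to about 8 tokens per line. The sketch below turns that into a quick estimator. `TOKENS_PER_LINE`, `tokens_from_lines`, and `count_lines` are hypothetical names for this example, and the ratio is only the average implied by this cheatsheet, not a property of any tokenizer.

```python
TOKENS_PER_LINE = 8  # implied by "500 lines is about 4,000 tokens" above


def tokens_from_lines(line_count: int) -> int:
    """Estimate the tokens consumed by a text of line_count lines."""
    return line_count * TOKENS_PER_LINE


def count_lines(path: str) -> int:
    """Count the lines of a file without loading it fully into memory."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


print(tokens_from_lines(500))  # 500-line TypeScript file -> 4000
print(tokens_from_lines(200))  # 200-line git diff        -> 1600
```

Calling `tokens_from_lines(count_lines("src/auth.ts"))` before asking for a full read gives an order of magnitude for what the file would cost.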

When the conversation reaches about 80% of the window (roughly 160,000 tokens), Claude Code automatically triggers compaction. You can track consumption by observing the cost indicator in the prompt. To understand this mechanism in depth, explore the complete context management tutorial.
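
To make the threshold concrete, here is a minimal budget sketch. It assumes the 200,000-token window and the 80% trigger stated in this cheatsheet. The segment names and sizes in `usage` are illustrative values picked from the ranges in the table above, not measurements.

```python
WINDOW_TOKENS = 200_000
COMPACTION_PERCENT = 80  # trigger level stated in this cheatsheet


def compaction_threshold(window: int = WINDOW_TOKENS) -> int:
    """Token count at which automatic compaction is expected to start."""
    return window * COMPACTION_PERCENT // 100


def remaining_before_compaction(segments: dict[str, int]) -> int:
    """Tokens left before the threshold; a negative value means it is exceeded."""
    return compaction_threshold() - sum(segments.values())


# Illustrative session: values chosen inside the typical ranges listed above.
usage = {
    "system_prompt": 8_000,
    "loaded_files": 40_000,
    "history": 60_000,
    "tool_results": 20_000,
}
print(compaction_threshold())              # 160000
print(remaining_before_compaction(usage))  # 32000
```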

Key takeaway: 200k tokens seem vast, but a large file plus a long history can saturate the window in under 10 exchanges.

How to optimize context for precise responses?

Here are concrete strategies to keep a clean and relevant context. Each technique reduces noise and improves response quality.

Target loaded files

Avoid loading entire files when you only need a section. Use the offset and limit parameters of the Read tool:

# Instead of reading the whole file (2000 lines = ~16000 tokens)
# Target the relevant section
# Read with offset=150, limit=50 -> only 400 tokens
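
The same offset and limit idea can be illustrated outside Claude Code. The sketch below is a hypothetical helper, not the Read tool itself: the name `read_slice` and its zero-based `offset` are assumptions of this example.

```python
from itertools import islice


def read_slice(path: str, offset: int, limit: int) -> list[str]:
    """Skip `offset` lines, then return at most `limit` lines from the file."""
    with open(path, encoding="utf-8") as handle:
        # islice stops reading once the requested window has been consumed.
        return list(islice(handle, offset, offset + limit))
```

With the figures used in this section, a 50-line slice costs about 400 tokens where the full 2,000-line file would cost about 16,000.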

Formulate concise prompts

Reduce the size of your messages. A 200-word prompt consumes about 300 tokens. A well-targeted 50-word prompt often gets better results.

# Bad: Verbose prompt (~300 tokens)
"Can you look at the file src/auth.ts and tell me if there are security issues, especially regarding JWT token validation, session management..."

# Good: Targeted prompt (~50 tokens)
"Audit src/auth.ts: JWT vulnerabilities and sessions"

Use /compact with targeted instructions

The /compact command accepts a text argument. Specify exactly what you want to preserve:

/compact keep the database schema and modified endpoints

To discover other shortcuts that speed up your workflow, consult the first conversations cheatsheet with Claude Code.

| Strategy | Estimated gain | When to use |
|---|---|---|
| Targeted prompt | 40-60% fewer tokens | Every message |
| Targeted /compact | Recovers 70-90% of context | After 8-10 exchanges |
| /clear + resume | 100% context freed | Topic change |
| Partial file read | 50-80% fewer tokens | Files > 200 lines |
| Well-structured CLAUDE.md | Reduces re-explanations | Initial setup |

In practice, a developer applying these techniques maintains an effective context for 25 to 40 exchanges instead of 10 to 15 without optimization.

Key takeaway: target your reads, compact regularly, and formulate short prompts - these three habits can more than double the number of exchanges you get from one context.

Why use Plan mode to save tokens?

Plan mode is an operating mode where Claude Code thinks and explores without executing actions. It consumes fewer tokens because it does not call costly tools (no bash, no file editing).

| Aspect | Normal Mode | Plan Mode |
|---|---|---|
| Available tools | All (Read, Edit, Bash...) | Read-only (Read, Grep, Glob) |
| Token consumption/turn | 3,000-15,000 | 1,000-4,000 |
| Primary use | Implement, modify, execute | Plan, explore, analyze |
| Shortcut | - | Shift+Tab |

When to activate Plan mode?

Activate Plan mode in these situations:

  1. You are exploring an unfamiliar codebase
  2. You are planning a multi-file refactoring
  3. You are evaluating multiple approaches before coding
  4. You want an action plan before writing code

# Switch to Plan mode
Shift+Tab

# Ask for an exploration
"Analyze the architecture of the src/api/ folder and propose a refactoring plan"

# Switch back to Normal mode to implement
Shift+Tab

Plan mode reduces token consumption by 60 to 75% compared to normal mode for exploration phases. The complete context management guide details advanced Plan mode use cases.
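
The 60 to 75% figure can be cross-checked against the per-turn ranges in the comparison table. The short calculation below compares the low ends and the high ends of both ranges. The helper name `reduction` is hypothetical, and pairing the range ends this way is an assumption made for illustration.

```python
def reduction(normal_tokens: int, plan_tokens: int) -> float:
    """Fraction of tokens saved when a turn costs plan_tokens instead of normal_tokens."""
    return 1 - plan_tokens / normal_tokens


low = reduction(3_000, 1_000)    # low ends of the two per-turn ranges
high = reduction(15_000, 4_000)  # high ends of the two per-turn ranges
print(f"{low:.0%} to {high:.0%}")  # 67% to 73%
```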

To go further on optimizing your workflows, SFEIR Institute offers a Claude Code training over one day. You will practice context management, Plan mode, and optimization strategies in supervised labs.

Key takeaway: Plan mode (Shift+Tab) divides your consumption by 3 during exploration phases - use it systematically before coding.

How do automatic compaction and PreCompact hooks work?

Automatic compaction triggers when the conversation reaches about 80% of the context window (approximately 160,000 tokens). Claude Code then summarizes the history to free up space.

The compaction process

  1. Claude Code detects that the 80% threshold is reached
  2. It generates a structured summary of the conversation
  3. The full history is replaced by this summary
  4. The conversation continues with the summary as a base

In practice, compaction reduces history from 120,000 tokens to approximately 8,000-12,000 tokens, a reduction of 90% or more.
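
The process and the reduction figure can be sketched in a few lines. This is a simplified model under the assumptions of this cheatsheet (200,000-token window, 80% trigger), not Claude Code's actual implementation. The `summarize` callable is a stand-in for whatever produces the summary.

```python
WINDOW_TOKENS = 200_000
THRESHOLD_PERCENT = 80  # trigger level stated in this cheatsheet


def needs_compaction(used_tokens: int) -> bool:
    """Step 1: detect that the threshold has been reached."""
    return used_tokens * 100 >= WINDOW_TOKENS * THRESHOLD_PERCENT


def compact(history: list[str], summarize) -> list[str]:
    """Steps 2-3: build a summary and replace the full history with it."""
    return [summarize(history)]


def reduction_percent(before: int, after: int) -> float:
    """Share of history tokens freed by compaction, in percent."""
    return (before - after) * 100 / before


print(needs_compaction(160_000))                     # True
print(reduction_percent(120_000, 12_000))            # 90.0
print(round(reduction_percent(120_000, 8_000), 1))   # 93.3
```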

Configuring a PreCompact hook

PreCompact hooks let you execute code before each compaction. Configure them in your .claude/settings.json file:

{
  "hooks": {
    "PreCompact": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "echo '=== CRITICAL CONTEXT ===' && cat .claude/context-notes.md",
            "timeout": 5
          }
        ]
      }
    ]
  }
}

This hook is intended to surface your critical context notes at compaction time so that key information can survive each compaction cycle. Check the hooks documentation for how PreCompact output is handled in your version of Claude Code.

Compaction commands

| Command | Behavior | Context preserved |
|---|---|---|
| /compact | Immediate manual compaction | Global summary |
| /compact focus auth | Themed targeted compaction | Summary focused on auth |
| Auto compaction (80%) | Automatic trigger | Global summary |
| PreCompact hook | Code executed before compaction | Hook data added |

To configure advanced hooks, consult the Git integration cheatsheet that shows hook examples in different contexts. You can also consult the context management FAQ for common questions about compaction.

Key takeaway: automatic compaction is your safety net - PreCompact hooks are your way of controlling what survives the summary.

How to scale with multi-sessions and horizontal parallelism?

When a single 200k-token context is not enough, distribute work across multiple parallel Claude Code sessions. This is horizontal scaling for AI-assisted development.

Launching parallel sessions

# Each claude process started in its own terminal is an independent session

# Terminal 1: backend session
claude "Focus on the backend API"

# Terminal 2: frontend session
claude "Focus on the frontend UI"

# Terminal 3: tests session
claude "Focus on the test suite"

Each session has its own 200k-token window. Three parallel sessions provide 600,000 tokens of total context.

Orchestrating with headless mode

For automated tasks, use headless mode that runs Claude Code without an interactive interface:

# Launch an audit in the background
claude -p "Audit all src/**/*.ts files for XSS vulnerabilities" --output-format json > audit.json

# Launch multiple tasks in parallel
claude -p "Fix types in src/models/" &
claude -p "Add missing tests in tests/" &
wait
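
The same fan-out can be scripted. The sketch below is one possible way to do it, assuming the `claude` CLI is on the PATH and using only the `-p` flag shown above. The helper names (`headless_command`, `run_command`, `run_parallel`) are hypothetical, and the `runner` parameter exists so the orchestration can be tried without calling the CLI.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def headless_command(prompt: str) -> list[str]:
    """Build the `claude -p` invocation used in the shell example above."""
    return ["claude", "-p", prompt]


def run_command(command: list[str]) -> str:
    """Run one command and return its standard output (raises on failure)."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def run_parallel(prompts: list[str], runner=run_command) -> list[str]:
    """Run every prompt concurrently; results keep the order of `prompts`."""
    commands = [headless_command(prompt) for prompt in prompts]
    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        return list(pool.map(runner, commands))
```

For example, `run_parallel(["Fix types in src/models/", "Add missing tests in tests/"])` starts both tasks at once and returns their outputs in order, each task running in its own process with its own context window.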

To leverage headless mode in CI/CD, the headless mode and CI/CD cheatsheet provides ready-to-use pipelines.

| Approach | Available tokens | Use case |
|---|---|---|
| Single session | 200,000 | Targeted task, single file |
| 2 parallel sessions | 400,000 | Frontend + backend separated |
| 3+ parallel sessions | 600,000+ | Multi-component project |
| Headless mode pipeline | Unlimited (sequential) | CI/CD, automated audits |

Multi-session mode improves productivity by 40% on projects involving more than 5 files simultaneously. In practice, 85% of developers who adopt multi-sessions reduce their refactoring time by 30 to 50%.

Key takeaway: open one session per functional domain - each session gets 100% of the context window without interference.

What keyboard shortcuts speed up context management?

Here is the complete reference of context-related shortcuts in Claude Code.

| Shortcut | Action | Impact on context |
|---|---|---|
| Shift+Tab | Toggles Plan/Normal mode | Reduces consumption by 60-75% |
| Esc (1x) | Interrupts current generation | Stops consumption immediately |
| Esc (2x) | Cancels the complete turn | Saves response tokens |
| Ctrl+C | Quits Claude Code | Frees all resources |
| Up arrow | Recalls the last message | Avoids retyping (0 extra tokens) |
| Tab | Accepts the proposed completion | Does not add prompt tokens |

To master all commands and shortcuts, the installation and first launch cheatsheet covers initial shortcut configuration.

If you want to go beyond this cheatsheet, SFEIR Institute offers the AI-Augmented Developer training over 2 days. You will learn to orchestrate multiple agents, optimize your context pipelines, and integrate Claude Code into your team workflows. For experienced profiles, the AI-Augmented Developer - Advanced one-day training deepens multi-session scaling and custom hooks.

Key takeaway: Shift+Tab and Esc (double-tap) are the two shortcuts that impact your context budget the most.

What common mistakes waste context?

Avoid these frequent pitfalls that consume tokens unnecessarily.

| Mistake | Token cost | Solution |
|---|---|---|
| Loading a 2,000-line file entirely | ~16,000 tokens | Target with offset/limit |
| Repeating the same rephrased question | ~600 tokens/message | Compact before rephrasing |
| Never using /compact | Saturation in 10 exchanges | Compact every 8-10 interactions |
| Ignoring Plan mode to explore | 3x more tokens | Switch to Plan mode with Shift+Tab |
| Doing everything in a single session | 100% polluted context | Separate into thematic sessions |
| Pasting complete logs into the prompt | 5,000-50,000 tokens | Filter logs before pasting |

In practice, 70% of context overflows come from files loaded without filtering. A single package-lock.json file can consume 80,000 tokens on its own.
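
Filtering before pasting can be as simple as keeping the lines that matter. The sketch below is a hypothetical helper. The keyword list and the 50-line cap are arbitrary defaults chosen for illustration, to be adapted to your own log format.

```python
def filter_log(text: str, keywords=("ERROR", "WARN"), max_lines: int = 50) -> str:
    """Keep only lines containing a keyword, limited to the last max_lines matches."""
    if max_lines <= 0:
        return ""
    matches = [
        line for line in text.splitlines()
        if any(keyword in line for keyword in keywords)
    ]
    # The most recent matches are usually the most relevant ones.
    return "\n".join(matches[-max_lines:])


log = "INFO start\nERROR db down\nINFO retry\nWARN slow\n"
print(filter_log(log))
```

Pasting the filtered text instead of the raw log keeps the error lines in context while leaving routine entries out of the window.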

To identify and fix these mistakes in your daily usage, consult the common context management mistakes guide. You can also explore the MCP protocol capabilities for externalizing certain data outside the main context.

Key takeaway: a single poorly targeted file can consume 40% of your window - always check the size before loading.

How to set up a daily context management workflow?

Here is a typical workflow for a development day with Claude Code, optimized for context management.

Startup sequence

  1. Launch Claude Code in the project directory: claude
  2. Verify the CLAUDE.md file is up to date: /init
  3. Activate Plan mode to explore: Shift+Tab
  4. Formulate your objective in a single targeted sentence

Work sequence

  1. Explore in Plan mode (read-only, token savings)
  2. Switch to Normal mode to implement: Shift+Tab
  3. Compact every 8 to 10 interactions: /compact
  4. Separate long tasks into dedicated sessions

End-of-day sequence

  1. Compact one last time with instructions: /compact summary of today's changes
  2. Note the session ID for resumption: visible in the prompt
  3. Resume the next day: claude --resume

# Complete workflow in commands
claude                          # 1. Start
/init                           # 2. Initialize CLAUDE.md
# Shift+Tab                    # 3. Plan mode
# ... explore and plan ...
# Shift+Tab                    # 4. Normal mode
# ... implement ...
/compact keep auth modifications  # 5. Compact
# ... continue ...
/compact final summary           # 6. End of day

In practice, this workflow maintains optimal context over a full 8-hour day with 40 to 60 interactions. To dive deeper into each step, the context management quick reference centralizes all resources.

Key takeaway: start in Plan mode, compact regularly, separate domains into sessions - these three principles cover 90% of needs.


Content written by SFEIR Institute - IT training organization specialized in cloud and AI technologies. Find our trainings at sfeir.com.
