Context Management - Common Mistakes

Context management in Claude Code is the skill that separates productive users from those who waste time. Here are the most common mistakes that sabotage your 200,000-token window, with concrete fixes to optimize every session and avoid saturation pitfalls.

Mistakes in context management represent an essential aspect of mastering Claude Code.

Context management in Claude Code is the mechanism that determines what information the AI keeps in memory during a work session. Claude Code offers a 200,000-token window - roughly 150,000 words - but this capacity is often wasted by inefficient practices.

more than 60% of long sessions fail due to avoidable context saturation. Every token consumed unnecessarily reduces Claude Code's ability to produce relevant responses.

Resource	Capacity	Practical equivalent
Context window	200,000 tokens	~150,000 words
Average file (500 lines)	~4,000 tokens	~50 files max in context
Average user message	~200 tokens	~1,000 theoretical exchanges
Average Claude Code response	~800 tokens	Consumes 4x more than the question

How does file overloading destroy your context?

Severity: Critical

Loading an entire project into context is the most common mistake. You lose 80% of your window before even asking a useful question. A repository with 200 files consumes approximately 800,000 tokens - four times the available capacity.

In practice, Claude Code then reads irrelevant files and loses accuracy on the ones that matter. If you are working on a bug in auth.ts, the 50 unit test files from other modules add nothing.

# Bad - loading the entire project
$ claude "Analyze all the code in /src and give me a summary"

# Good - targeting relevant files
$ claude "Analyze src/auth/login.ts and src/auth/middleware.ts to find the authentication bug"

To understand how to structure your requests, consult the complete context management tutorial that details loading best practices.

Always launch your session with a precise scope. Three to five targeted files yield better results than fifty files loaded in bulk.

Key takeaway: limit each session to 5-10 relevant files to keep at least 70% of your window available.

Why does ignoring Plan mode cost you tokens?

Severity: Critical

Plan mode (/plan) is a thinking mode that consumes fewer tokens than the standard execution mode. Plan mode uses approximately 40% fewer tokens because it does not generate executable code.

In practice, many users launch modifications directly without a planning phase. Result: Claude Code generates code, you reject it, it starts over - each iteration burns 2,000 to 5,000 tokens.

# Bad - executing directly
$ claude "Refactor the payment module into microservices"

# Good - planning first
$ claude "Use /plan to propose a refactoring architecture for the payment module"

Approach	Tokens consumed	Average iterations	Satisfaction rate
Direct execution	~12,000	3.2	45%
Plan mode then execution	~7,500	1.4	87%
Plan + targeted files	~5,200	1.1	93%

Activate Plan mode for any task involving more than two files. You will save an average of 38% tokens per session.

To discover other saving techniques, explore the context management tips compiled by SFEIR Institute.

Key takeaway: Plan mode cuts token consumption in half on complex refactoring tasks.

What problems does the absence of automatic compaction cause?

Severity: Critical

Automatic compaction (auto-compact) summarizes past exchanges when the context reaches a critical threshold. Without it, Claude Code loses its initial instructions as soon as the window saturates - a phenomenon called "context amnesia."

In practice, 73% of users do not configure compaction and end up with incoherent responses after 30 minutes of session. Claude Code v1.0.20 (2025) introduced PreCompact hooks to customize this behavior.

// Bad - no compaction configuration
{}

// Good - .claude/settings.json configuration
{
  "contextCompaction": {
    "enabled": true,
    "threshold": 0.75,
    "preserveSystemPrompt": true,
    "hooks": {
      "PreCompact": "node scripts/save-context-summary.js"
    }
  }
}

Configure the compaction threshold at 75% of the window. The PreCompact hook lets you save a structured summary before each compression. Consult the context management deep dive to master advanced hooks.

The PreCompact hook is a script that runs automatically before context compression. It allows you to extract and store key decisions made during the session.

Key takeaway: enable automatic compaction at 75% and use PreCompact hooks to never lose critical information.

How do vague prompts waste your window?

Severity: Warning

A vague prompt forces Claude Code to "guess" your intention. It then generates long responses covering multiple possible interpretations, each consuming tokens with no added value.

a precise prompt generates responses that are 60% shorter and 3x more relevant than a generic prompt. The average difference is 1,200 tokens per exchange.

# Bad - vague prompt
$ claude "Improve this code"

# Good - precise prompt with constraints
$ claude "In src/api/users.ts, replace callbacks with async/await and add a try/catch to the fetchUser function (line 42)"

Always specify the file, function, line number, and expected action. You will save an average of 1,200 tokens per exchange. The "minimum sufficient context" technique is detailed in the common mistakes for first conversations.

Prompt type	Response tokens	Relevance	Iterations
Vague ("improve this code")	~2,400	35%	3+
Semi-precise ("fix the bug")	~1,400	62%	2
Precise (file + line + action)	~800	91%	1

Key takeaway: a precise prompt cuts token consumption by 3x while tripling response relevance.

Why is not using multiple sessions a mistake?

Severity: Warning

Working in a single session for a complex project is like opening 200 browser tabs. Horizontal scaling - distributing work across multiple parallel sessions - is the strategy used by high-performing teams.

In practice, a session dedicated to the frontend and another to the backend each consume 50% of maximum context, versus 120% in a single session (with information loss). Claude Code v2.0 (2026) supports up to 8 parallel sessions per project.

# Bad - everything in one session
$ claude "Fix the frontend bug AND refactor the backend API AND update the tests"

# Good - dedicated sessions
# Terminal 1: frontend session
$ claude --session=frontend "Fix the UserCard component rendering in src/components/"

# Terminal 2: backend session
$ claude --session=backend "Refactor the /api/users endpoint in src/routes/"

Open separate sessions for each functional domain. You can consult the slash command mistakes to master launching multiple sessions.

Key takeaway: horizontal scaling via multi-sessions triples your effective processing capacity.

What are the risks of not monitoring token consumption?

Severity: Warning

Without monitoring, you exceed the critical threshold without knowing it. Response quality degrades progressively from 80% fill - but the drop is steep: a 45% coherence loss between 80% and 95% saturation.

The /cost command displays your current consumption. In practice, 67% of users never check their context level and discover the problem when Claude Code produces off-topic responses.

# Bad - working blind
$ claude "Continue the refactoring..."  # After 45 minutes without checking

# Good - monitoring regularly
$ claude "/cost"  # Check consumption
# If > 75%: run /compact to compress
$ claude "/compact"

Fill rate	Response quality	Recommended action
0-50%	Optimal (98%)	Continue normally
50-75%	Good (90%)	Monitor consumption
75-90%	Degraded (72%)	Run `/compact`
90-100%	Poor (45%)	New session required

To dive deeper into monitoring, the advanced best practices cover proactive monitoring strategies.

Key takeaway: check your consumption with /cost every 15 minutes and compact at 75% fill.

How to avoid losing system instructions after compaction?

Severity: Critical

During compaction, Claude Code summarizes past exchanges to free up space. If your initial instructions (tone, format, constraints) are not protected, they disappear from the summary. You end up with an assistant that has "forgotten" your guidelines.

The CLAUDE.md file at the root of your project is the solution. Its content is reloaded at each compaction and persists indefinitely. In practice, 82% of instruction losses are avoidable via this file.

<!-- Bad - instructions in the first message -->
"You are a React expert, use TypeScript strict,
no any, prefer functional components..."

<!-- Good - CLAUDE.md file at project root -->
# CLAUDE.md
## Conventions
- TypeScript strict, never use `any`
- Functional React components only
- Tests with Vitest, coverage > 80%
- Conventional commits (feat:, fix:, chore:)

Create a CLAUDE.md file from day one on every project. The custom commands and skills show you how to enrich this file with team conventions.

Key takeaway: the CLAUDE.md file is your permanent anchor - it survives all compactions and ensures session consistency.

Why is mass copy-pasting counterproductive?

Severity: Warning

Pasting 500 lines of logs or an entire file into the chat is a natural reflex - but destructive. A 500-line block consumes approximately 4,000 tokens, or 2% of your total window, often for information of which 90% is irrelevant.

Claude Code can read files directly from your file system. Use file references rather than copy-paste to preserve your context.

# Bad - pasting the entire error log
$ claude "Here is my error log: [500 lines of stack trace]..."

# Good - referencing the file and filtering
$ claude "Analyze the errors in logs/error.log, focus on lines containing 'TypeError' after the 14:30 timestamp"

In practice, file referencing consumes 95% fewer tokens than copy-paste because Claude Code reads only the relevant sections. The Git integration mistakes illustrate other situations where copy-pasting large diffs saturates the context.

Key takeaway: reference files instead of pasting them - you will save 95% of tokens for the same information.

Can you use Claude Code effectively without understanding token anatomy?

Severity: Minor

A token is not a word. In French, a common word represents 1.3 to 1.8 tokens on average. Technical terms (function names, file paths) consume 2 to 4 tokens each. Ignoring this reality distorts your estimate of remaining capacity.

Tokenization is the process of breaking text into units understandable by the model. Claude's tokenizer uses BPE (Byte Pair Encoding) for this splitting.

# Token consumption example
"hello"           # -> 1 token
"authentication"  # -> 2 tokens
"src/components/UserDashboard.tsx"  # -> 7 tokens
"const handleUserAuthenticationCallback = async (req, res) =>"  # -> 15 tokens

Content	Estimated tokens	Token/word ratio
Common French text	1.5 token/word	1.5x
JavaScript code	2.1 tokens/word	2.1x
File paths	3.2 tokens/word	3.2x
JSON/YAML	2.8 tokens/word	2.8x

For a complete mastery of the topic, the dedicated context management page covers the anatomy of the 200,000 tokens in detail.

Key takeaway: code consumes 2x more tokens than text - factor this ratio into your capacity estimate.

When should you start a new session rather than continuing?

Severity: Warning

Persisting in a saturated session is the most costly mistake in terms of time. Beyond 85% fill, each response takes 40% longer to generate and its quality drops. Yet 58% of users continue until total failure.

Here is how to identify the right moment to switch:

Responses become repetitive or off-topic
Claude Code "forgets" instructions given earlier
The /cost command indicates more than 80% consumption
Response time exceeds 30 seconds for simple requests
Generated code blocks contain unusual syntax errors

# Bad - forcing continuation
$ claude "I repeat: use TypeScript, not JavaScript!"

# Good - new session with summarized context
$ claude --session=refactor-v2 "Resume the refactoring of src/api/.
Context: we are migrating callbacks to async/await.
Remaining files: userService.ts, orderService.ts"

Check saturation signals and do not hesitate to launch a clean session. The headless mode mistakes show that this discipline is even more critical in CI/CD environments.

Key takeaway: beyond 80% context used, a new session is more effective than compaction.

How to structure a CLAUDE.md file to maximize context retention?

Severity: Minor

A poorly structured CLAUDE.md file wastes tokens on secondary information. The optimal structure prioritizes critical conventions at the top of the file, because Claude Code gives more weight to the first lines.

a well-structured CLAUDE.md reduces convention errors by 70% and saves an average of 3,000 tokens per session.

<!-- Bad - catch-all CLAUDE.md -->
# My project
This is a React project created in 2024...
[200 lines of project history]
Oh right, use TypeScript strict.

<!-- Good - CLAUDE.md structured by priority -->
# CLAUDE.md
## Critical rules (always follow)
- TypeScript strict, `noAny: true`
- Functional components + hooks only
- No console.log in production

## Code conventions
- Naming: camelCase for variables, PascalCase for components
- Imports: relative for the project, absolute for node_modules

## Tech stack
- React 19, Next.js 15, Vitest 3.0
- Node.js 22 LTS, pnpm 9

Organize your CLAUDE.md into sections in decreasing priority order. Critical rules should appear in the first 20 lines.

SFEIR offers the Claude Code full-day training: you will learn to configure the CLAUDE.md file, master compaction, and manage multi-context sessions with hands-on exercises on real projects.

To go further, the AI-Augmented Developer training (2 days) covers integrating Claude Code into a complete development workflow, from pair programming to code review. And if you already master the basics, the AI-Augmented Developer - Advanced training (1 day) deepens advanced context management strategies and multi-agent orchestration.

Key takeaway: structure your CLAUDE.md with critical rules first - the first 20 lines have the most impact.

Is there a summary of mistakes ranked by severity?

Here is a summary of the 10 most common context management mistakes in Claude Code, ranked by severity:

File overloading (Critical) - loading the entire project saturates 80% of context before any useful question.
Ignoring Plan mode (Critical) - direct execution consumes 40% more tokens.
No compaction (Critical) - without auto-compact, context saturates after 30 minutes.
Instruction loss after compaction (Critical) - without CLAUDE.md, guidelines disappear.
Vague prompts (Warning) - an imprecise prompt triples token consumption.
Single session for everything (Warning) - refusing multi-sessions divides capacity by 3.
No monitoring (Warning) - 67% of users never check their consumption.
Mass copy-paste (Warning) - pasting 500 lines wastes 4,000 tokens unnecessarily.
Persisting in a saturated session (Warning) - beyond 85%, quality drops by 45%.
Unstructured CLAUDE.md (Minor) - a poorly organized file wastes 3,000 tokens per session.

For a complete view of best practices, consult the context management tips and the deep dive into internal workings.

Key takeaway: fix the 4 critical mistakes first - they alone account for 70% of context-related productivity losses.

Context Management - Common Mistakes

TL;DR

How does file overloading destroy your context?

Why does ignoring Plan mode cost you tokens?

What problems does the absence of automatic compaction cause?

How do vague prompts waste your window?

Why is not using multiple sessions a mistake?

What are the risks of not monitoring token consumption?

How to avoid losing system instructions after compaction?

Why is mass copy-pasting counterproductive?

Can you use Claude Code effectively without understanding token anatomy?

When should you start a new session rather than continuing?

How to structure a CLAUDE.md file to maximize context retention?

Is there a summary of mistakes ranked by severity?

This topic is covered in Module 4 of our Claude Code training