Context as a Finite Resource
Every technique in this guide stems from one reality: the context window is finite, and quality degrades as it fills. Understanding the mechanics of this constraint is essential for engineering effective agentic workflows.
How the Context Window Works
The context window holds everything the model can “see” during a single interaction:
- System instructions — agent configuration files, permissions, tool definitions
- Conversation history — all messages between you and the agent
- Tool outputs — file reads, command results, search results
- Agent state — current task context, accumulated decisions
For many modern AI coding agents, the context window is approximately 200,000 tokens. A single file read can consume 1,000-10,000 tokens. A debugging session exploring 20 files might consume 100,000+ tokens — half the window.
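The arithmetic above can be sketched in a few lines. This is a back-of-envelope estimate only, using the approximate figures from this guide; the 5,000-tokens-per-file average and the 200,000-token window are illustrative assumptions, not measurements.

```python
# Rough context-budget arithmetic using this guide's approximate figures.
# All numbers are illustrative assumptions, not exact measurements.
WINDOW = 200_000  # tokens in a typical context window

def tokens_for_file_reads(n_files: int, tokens_per_file: int = 5_000) -> int:
    """Estimate tokens consumed by reading n_files average-sized files."""
    return n_files * tokens_per_file

spent = tokens_for_file_reads(20)             # a 20-file debugging session
print(spent)                                  # 100000
print(f"{spent / WINDOW:.0%} of the window")  # 50% of the window
```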
What Consumes Context (and How Much)
| Activity | Typical Token Cost |
|---|---|
| System prompt + agent configuration file | 2,000-5,000 |
| Single file read (500 lines) | 3,000-6,000 |
| `git diff` output | 500-5,000 |
| Test suite output | 1,000-10,000 |
| Build error log | 500-3,000 |
| Codebase exploration (10 files) | 30,000-60,000 |
| Long debugging session | 50,000-150,000 |
The biggest consumers are file reads during exploration and verbose tool outputs (build logs, test results, large diffs).
The Degradation Curve
Quality doesn’t degrade linearly — it follows a curve:
```
Quality
  ▲
  │ ████████████████
  │         ████████
  │              ████
  │                 ████
  │                    ██
  │                      ██
  │                        █
  └──────────────────────────────▶ Context Fill %
   0%    25%    50%    75%  90%  100%
```

Key thresholds:
- 50%: First signs of degradation in instruction following
- 75%: Noticeable recall issues for earlier context
- 90%: Significant quality loss, missed instructions
- 95%: Auto-compaction triggers (varies by AI coding agent)
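The thresholds above can be expressed as a simple lookup. The cutoffs below mirror this guide's rough figures; real degradation behavior varies by model and agent, so treat this as a mental model rather than a spec.

```python
# Map context fill percentage to the degradation stages listed above.
# Cutoffs are this guide's rough figures; actual behavior varies by agent.
def degradation_stage(fill_pct: float) -> str:
    if fill_pct >= 95:
        return "auto-compaction"          # agent compacts automatically
    if fill_pct >= 90:
        return "significant quality loss" # missed instructions
    if fill_pct >= 75:
        return "recall issues"            # earlier context forgotten
    if fill_pct >= 50:
        return "early degradation"        # instruction following slips
    return "healthy"

print(degradation_stage(40))  # healthy
print(degradation_stage(80))  # recall issues
```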
Managing the Budget
Section titled “Managing the Budget”Strategy 1: Track Continuously
Configure a status indicator to show context utilization — most AI coding agents expose this in their status bar or via a configuration option. This makes context usage always visible, like a fuel gauge.
Strategy 2: Clear Between Tasks
Clear or reset your context between unrelated tasks. A fresh context for a new task always outperforms a polluted context from an old one.
Strategy 3: Scope Investigations
Never ask your AI agent to “investigate” without scope boundaries:
```
# Bad — unbounded exploration
Look into why the tests are failing.

# Good — scoped investigation
Check the auth middleware tests in src/__tests__/auth/.
The test 'should refresh expired tokens' is failing.
Look at the token refresh logic in src/middleware/auth.ts lines 45-80.
```

Strategy 4: Use Sub-Agents for Heavy Reads
When you need to explore many files, delegate to sub-agents:
```
Use sub-agents to:
1. Find all files that import from the auth module
2. Check which ones handle session tokens
3. Report back a summary of the token flow
```

The sub-agents consume their own context windows. Your main context receives only the summary.
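The delegation pattern can be sketched as follows. Note that `run_subagent` is a hypothetical stand-in, not a real agent API; the point it illustrates is that the main context pays only for the returned summary, never for the raw file contents the sub-agent read.

```python
# Illustrative sketch of sub-agent delegation. `run_subagent` is a
# hypothetical stand-in for whatever delegation mechanism your agent
# provides; only its short summary enters the main context.

def run_subagent(task: str, files: list[str]) -> str:
    # Imagine this spins up a fresh context, reads every file in full,
    # and returns only a compact summary string.
    return f"summary of {len(files)} files for task: {task}"

main_context: list[str] = []  # what the main agent actually retains
summary = run_subagent("trace token flow", ["a.ts", "b.ts", "c.ts"])
main_context.append(summary)  # one summary line, not three full files
print(main_context[0])
```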
Strategy 5: Compact with Intent
When you compact your context, tell your agent what to preserve:
```
Compact context. Preserve: modified file list, the API design we agreed on,
test commands, and the remaining implementation steps.
Discard: file exploration, build logs, superseded approaches.
```

For tool-specific compaction commands and configuration, see the Tool Configuration Reference.
The Economics
Think of context as a budget:
| Spend | Cost (Tokens) | Value |
|---|---|---|
| Essential context (agent config file, task description) | 3,000 | High — always needed |
| Targeted file reads | 5,000-15,000 | High — directly relevant |
| Exploratory file reads | 20,000-60,000 | Medium — use sub-agents instead |
| Build/test output | 5,000-20,000 | Low-Medium — extract key info |
| Failed approaches | 10,000-30,000 | Negative — actively harmful |
The golden rule: Every token should earn its place. If a token doesn’t directly contribute to the current task, it’s reducing the quality of the tokens that do.
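Tallying a session's spend against the categories in the table makes the budget concrete. The spend figures below are a hypothetical session, and the 200,000-token window is the guide's approximate figure.

```python
# Tally a hypothetical session's context spend against the budget
# categories above. Figures are illustrative, not measured.
spend = {
    "essential context": 3_000,
    "targeted file reads": 10_000,
    "exploratory file reads": 40_000,  # candidate for sub-agent delegation
    "build/test output": 12_000,       # candidate for extraction/trimming
}
total = sum(spend.values())
print(total)                    # 65000
print(f"{total / 200_000:.1%}") # 32.5% — still below the 50% threshold
```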
Key Takeaways
- Context utilization directly impacts output quality — track it continuously
- The biggest context consumers are file reads and verbose tool outputs
- Clear between tasks, scope investigations, and delegate exploration to sub-agents
- Target 40-60% utilization for complex reasoning tasks
- Compact proactively with explicit preservation instructions
- Failed approaches are actively harmful — clear them out