You're 40 minutes into a debugging session. Claude Code has the stack trace, the failing test, three modified files, and a plan for the fix. Then the context window fills up. Auto-compact fires, summarizes everything into "investigated authentication bug and modified middleware files," and the agent starts over from scratch. The /compact command exists to prevent exactly this scenario: you control when compaction happens and what gets preserved.
What Is /compact in Claude Code?
The /compact command manually triggers context compaction in Claude Code. It summarizes your conversation history, clears old tool outputs (file reads, grep results, bash output), and restarts the session from a compressed state. The result: you free up token space and keep working without losing the thread of your task.
The process works in three steps:
- Tool outputs cleared. Old file reads, grep results, and command outputs are removed or truncated. These are typically the largest token consumers in a long session.
- Conversation summarized. The full chat history gets condensed into a structured summary covering what was completed, what's in progress, and which files were modified.
- Session continues. The agent picks up from the compressed state with a fresh token budget and your CLAUDE.md files re-injected from disk.
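The three steps above can be pictured with a toy model. This is only a sketch under stated assumptions, not Claude Code's implementation: the `Message` type, the 4-characters-per-token estimate, and the truncating "summarizer" are all hypothetical stand-ins for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    text: str


def count_tokens(text: str) -> int:
    # Crude assumption: about 4 characters per token.
    return max(1, len(text) // 4)


def total_tokens(history: list[Message]) -> int:
    return sum(count_tokens(m.text) for m in history)


def compact(history: list[Message], claude_md: str, instructions: str = "") -> list[Message]:
    # Step 1: clear tool outputs (file reads, grep results, command output).
    conversation = [m for m in history if m.role != "tool"]

    # Step 2: summarize. Placeholder summarizer: keep the first 80 characters
    # of each remaining message. A real summary is written by the model.
    header = "Summary of earlier work"
    if instructions:
        header += f" (preserve: {instructions})"
    bullets = [f"- {m.role}: {m.text[:80]}" for m in conversation]
    summary = Message("assistant", "\n".join([header, *bullets]))

    # Step 3: continue from CLAUDE.md (re-read from disk) plus the summary.
    return [Message("user", claude_md), summary]


history = [
    Message("user", "Fix the failing auth test"),
    Message("tool", "x" * 40_000),  # stands in for a large file read
    Message("assistant", "The token refresh races with the session check"),
]
compacted = compact(history, claude_md="# Project rules", instructions="auth middleware changes")
print(total_tokens(history), "->", total_tokens(compacted))
```

In this toy run almost all of the savings come from step 1, which matches the point above that tool outputs are typically the largest token consumers.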
Basic /compact usage
# Run compact with no arguments (generic summary)
/compact
# Run compact with preservation instructions
/compact preserve all file paths, the auth middleware changes, and current test failures
# Run compact focused on specific work
/compact focus on the API refactoring in src/routes/ and the migration plan
The key difference from auto-compact is timing. Auto-compact fires when the context window hits ~75-80% capacity, with no awareness of your task state. /compact lets you choose the moment: after finishing a feature, after fixing a bug, or at any logical breakpoint where you can clearly articulate what needs to be preserved.
CLAUDE.md survives compaction
CLAUDE.md files are re-read from disk and re-injected fresh after every compaction. Instructions in CLAUDE.md are never lost. Instructions given only in conversation can be lost during compaction. Put anything the agent must always remember into CLAUDE.md. See our Claude Code best practices guide for structuring effective CLAUDE.md files.
/compact vs /clear: When to Use Which
Both commands free context space, but they work differently and serve different purposes. Choosing the wrong one wastes tokens or destroys useful context.
| Dimension | /compact | /clear |
|---|---|---|
| What it does | Summarizes conversation, preserves compressed state | Wipes conversation entirely, fresh start |
| Memory preserved | Summary of completed work, modified files, current plan | Nothing. CLAUDE.md reloads, but conversation is gone |
| When to use | Mid-task, at logical breakpoints | Switching to a completely different task |
| Aliases | None | /reset, /new |
| CLAUDE.md | Re-injected from disk | Re-injected from disk |
| Token recovery | Partial: summary still consumes tokens | Complete: full context window available |
| Risk | Lossy summary may miss details | No risk if task is done; total loss if task is in progress |
| Command history | Preserved | Reset |
Use /compact when:
- You're in the middle of a task and need more context space
- The session has accumulated noise from debugging, file reads, or exploration
- You've just finished one sub-task and are moving to the next within the same project
- You can clearly articulate what needs to be preserved
Use /clear when:
- You're done with the current task and starting something unrelated
- The context is so polluted that even a summary would carry noise
- You want the full ~114K usable tokens instead of a summary taking up 3K-5K
- You're switching from debugging to feature work, or between different codebases
The common mistake
Many users run /clear when they should run /compact. If you're mid-task and run /clear, you lose all context about what was done, what was tried, and what failed. The agent starts from zero and may repeat the same investigation you already did. Use /compact preserve [key details] instead, then use /clear only when you genuinely need a blank slate.
Customizing the Compact Prompt
The compact prompt is the instructions you pass to /compact that control what Claude preserves during summarization. Without instructions, Claude produces a generic summary. With specific instructions, you get a summary tailored to your current work.
Inline compact instructions
Pass instructions directly after the /compact command. These apply only to this compaction.
Effective compact prompts
# Preserve debugging context
/compact preserve the stack trace from the auth error, all modified files in src/middleware/, and the hypothesis that the token refresh is racing with the session check
# Preserve refactoring progress
/compact preserve the list of files migrated from REST to GraphQL, the remaining files to migrate, and the schema changes in src/graphql/
# Preserve test state
/compact preserve which tests are passing, which are failing, the exact error messages from the 3 failing tests, and the fix attempted in user-service.ts
Persistent compact instructions in CLAUDE.md
For rules that should apply to every compaction (including auto-compact), add a "Compact Instructions" section to your CLAUDE.md file.
CLAUDE.md compact instructions
# Compact Instructions
When compacting, always preserve:
- All modified file paths with the specific changes made
- Current test results (which pass, which fail, exact error messages)
- The current task and remaining TODO items
- Any architectural decisions made and their reasoning
- Active debugging hypotheses
These instructions are loaded into context at session start and survive compaction themselves (CLAUDE.md is re-read from disk after every compact). This means your compact preferences persist across the entire session lifecycle.
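If you maintain several projects, a small script can add the section wherever it is missing. The sketch below is an illustration only: `ensure_compact_section` is a hypothetical helper, the section text is a shortened copy of the example above, and it assumes the heading is written exactly as "# Compact Instructions".

```python
from pathlib import Path

HEADING = "# Compact Instructions"
SECTION = HEADING + """
When compacting, always preserve:
- All modified file paths with the specific changes made
- Current test results (which pass, which fail, exact error messages)
- The current task and remaining TODO items
"""


def ensure_compact_section(claude_md: Path, section: str = SECTION) -> bool:
    """Append the section if CLAUDE.md lacks it. Returns True if the file changed."""
    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if HEADING in existing:
        return False  # already present, leave the file untouched
    if existing and not existing.endswith("\n"):
        existing += "\n"
    if existing:
        existing += "\n"  # blank line between old content and the new section
    claude_md.write_text(existing + section, encoding="utf-8")
    return True
```

Calling `ensure_compact_section(Path("CLAUDE.md"))` twice changes the file at most once, so the helper is safe to run repeatedly.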
What makes a good compact prompt
Effective compact prompts share three properties:
- Specificity. "Preserve the file paths" is better than "remember what we did." "Preserve the exact error message from line 47 of auth.ts" is better than "preserve the errors."
- Task relevance. Preserve information you'll need in the next 10-20 interactions. Drop information from completed sub-tasks that won't be referenced again.
- Structured format. Use categories (files modified, tests failing, current plan) so the summary is scannable. A bulleted summary is more useful than a paragraph.
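One way to apply these three properties consistently is to assemble the prompt from named categories instead of typing it freehand. The helper below is a hypothetical sketch: the function name and category labels are illustrative, and the only thing it relies on is that /compact accepts free-text instructions after the command.

```python
def build_compact_prompt(files_modified=(), tests_failing=(), current_plan="", hypotheses=()):
    # Each category maps to one labeled clause; empty categories are skipped
    # so the prompt only names what is relevant to the next steps.
    sections = [
        ("modified files", ", ".join(files_modified)),
        ("failing tests", ", ".join(tests_failing)),
        ("current plan", current_plan),
        ("open hypotheses", ", ".join(hypotheses)),
    ]
    kept = [f"{label}: {value}" for label, value in sections if value]
    if not kept:
        return "/compact"
    return "/compact preserve " + "; ".join(kept)


prompt = build_compact_prompt(
    files_modified=["src/middleware/auth.ts"],
    tests_failing=["token refresh test"],
    current_plan="serialize token refresh and session check",
)
print(prompt)
```

The printed line can be pasted into the session as-is. Keeping the categories fixed makes it harder to forget the current plan or the failing tests when compacting in a hurry.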
Why generic /compact fails
Running /compact with no instructions produces summaries like "investigated authentication bug, modified middleware files, ran tests." This tells the agent almost nothing actionable. It does not know which middleware files, what the bug was, or which tests failed. The agent has to re-read files and re-run tests to recover, consuming the tokens you just freed.
Micro Compact in Claude Code
Micro compact is a lighter form of context management that Claude Code runs before the full auto-compact threshold. Instead of summarizing the entire conversation, micro compact selectively clears older tool outputs while preserving the conversation flow.
How micro compact works
When context usage reaches approximately 60-70% of the window, Claude Code may run a micro compact pass. This targets the largest token consumers first:
- Old file read outputs that have already been processed
- Grep/search results from earlier in the session
- Bash command outputs that are no longer relevant
- Verbose error logs from resolved issues
The conversation messages themselves remain intact. Your instructions, Claude's reasoning, and the task flow are preserved. Only the raw data outputs get cleared.
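The behavior described above can be sketched as a toy model. The real selection heuristics are not described here, so everything in this block is an assumption for illustration: the `Entry` type, the placeholder text, and the rule of keeping only the most recent tool outputs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    role: str  # "user", "assistant", or "tool"
    text: str


PLACEHOLDER = "[tool output cleared]"


def micro_compact(history, keep_recent_tools=2):
    # Find every tool output, then mark all but the newest few as stale.
    tool_indexes = [i for i, e in enumerate(history) if e.role == "tool"]
    if keep_recent_tools > 0:
        stale = set(tool_indexes[:-keep_recent_tools])
    else:
        stale = set(tool_indexes)
    # Conversation messages pass through untouched; stale tool outputs
    # are replaced by a short placeholder.
    return [replace(e, text=PLACEHOLDER) if i in stale else e
            for i, e in enumerate(history)]


history = [
    Entry("user", "Find the auth bug"),
    Entry("tool", "A" * 1000),  # oldest file read
    Entry("assistant", "Looks like a race in token refresh"),
    Entry("tool", "B" * 1000),
    Entry("tool", "C" * 1000),
]
result = micro_compact(history, keep_recent_tools=2)
```

Here only the oldest tool output is cleared. The user and assistant entries are identical before and after, which is the property that separates micro compact from a full summary.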
Micro compact vs full compact
| Dimension | Micro Compact | Full /compact |
|---|---|---|
| Trigger | Automatic at ~60-70% context | Manual or auto at ~75-80% |
| What gets cleared | Old tool outputs only | Tool outputs + conversation summary |
| Conversation preserved | Fully intact | Summarized |
| Token recovery | Moderate (10-30K) | Large (50-150K) |
| Information loss | Minimal (raw data only) | Significant (details lost in summary) |
| User control | Automatic only | Manual or automatic |
Micro compact is a safety valve. It gives you more runway before full compaction is needed. But if your workflow is generating large amounts of tool output (reading many files, running verbose test suites, processing logs), micro compact can only delay the inevitable. The real fix is reducing how many tokens each operation consumes in the first place.
Why Compaction Is Needed: The Context Waste Problem
The 200K token window sounds large. In a typical setup, roughly 86K tokens are consumed or reserved before you type your first message.
| Component | Typical tokens | Notes |
|---|---|---|
| System prompt + tools | ~20,000 | Fixed cost, always present |
| MCP tool schemas | 900-51,000 | More MCP servers = faster compaction |
| CLAUDE.md files | 300-2,000 | Survives compaction, loads every request |
| Auto-compact buffer | ~33,000 | Reserved, not usable |
| Conversation + outputs | ~114,000 | This is what fills up and triggers compaction |
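The arithmetic behind the table can be checked directly. This is a small sketch using the table's figures; the 31,000-token MCP load in the middle case is an assumption chosen because it reproduces the ~114K figure, not a measured value.

```python
CONTEXT_WINDOW = 200_000


def usable_tokens(mcp_schemas, claude_md, system_and_tools=20_000, autocompact_buffer=33_000):
    # Everything that is spent or reserved before the conversation starts.
    overhead = system_and_tools + mcp_schemas + claude_md + autocompact_buffer
    return CONTEXT_WINDOW - overhead


lightest = usable_tokens(mcp_schemas=900, claude_md=300)        # 145,800
typical = usable_tokens(mcp_schemas=31_000, claude_md=2_000)    # 114,000 (assumed MCP load)
heaviest = usable_tokens(mcp_schemas=51_000, claude_md=2_000)   # 94,000
print(lightest, typical, heaviest)
```

The spread between the lightest and heaviest cases is driven almost entirely by MCP tool schemas, which is why trimming unused servers buys so much room.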
Of the ~114K tokens available for work, two operations consume the most:
File reads: 2K-5K tokens each
Claude reads entire files to find the 10-20 lines it needs. A debugging session that explores 15 files consumes 30K-75K tokens on file contents alone. Cognition measured that agents spend 60% of their time just searching for code.
File rewrites: 200-5,000 tokens each
The default edit pattern rewrites entire files. Changing 3 lines in a 200-line file outputs all 200 lines, consuming tokens for 197 lines that did not change. A refactoring task touching 15 files can generate 30K tokens of unnecessary file rewrites.
These two operations create a feedback loop. The agent reads files to find code, edits files by rewriting them, then reads the same files again to verify the changes. Each cycle pushes the context closer to compaction. After compaction, the agent re-reads the files because the summary does not contain the actual code. This triggers more token consumption, which triggers another compaction. Repeat.
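To see how quickly the loop eats the budget, the per-operation figures above can be plugged into a short estimate. The numbers are the article's ranges; the function names and the assumption that a verification pass costs one extra full read are illustrative.

```python
USABLE_TOKENS = 114_000


def session_cost(files, tokens_per_read, tokens_per_rewrite=0, verify=False):
    # One read, an optional full rewrite, and an optional re-read to verify.
    per_file = tokens_per_read + tokens_per_rewrite
    if verify:
        per_file += tokens_per_read
    return files * per_file


def cycles_before_compaction(tokens_per_cycle, budget=USABLE_TOKENS):
    return budget // tokens_per_cycle


explore_low = session_cost(files=15, tokens_per_read=2_000)    # 30,000
explore_high = session_cost(files=15, tokens_per_read=5_000)   # 75,000

# Read (2K) + full rewrite (2K) + verification read (2K) = 6K per cycle.
one_cycle = session_cost(files=1, tokens_per_read=2_000, tokens_per_rewrite=2_000, verify=True)
print(explore_low, explore_high, cycles_before_compaction(one_cycle))  # 30000 75000 19
```

Under these assumptions, about 19 read-edit-verify cycles on a 200-line file are enough to exhaust the working budget, before counting any conversation text at all.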
The compaction feedback loop
Sessions that compact once tend to compact repeatedly. Each compaction forces the agent to re-read files to recover lost context, which consumes more tokens, which accelerates the next compaction. This is why context rot gets worse over time, not better. The solution is not better compaction but less waste per operation.
The Limitations of /compact
Even with perfect compact prompts, /compact is fundamentally lossy. Summarization cannot preserve everything. Understanding the specific failure modes helps you work around them.
Exact values lost
Error messages, line numbers, variable values, stack trace details. The summary says 'fixed the null pointer in auth' but not that it was on line 147 of middleware.ts where req.user.id was undefined because the session expired.
Decision reasoning lost
The agent chose approach A over approach B based on 5 minutes of analysis. After compaction, it only knows it chose A. If it encounters a problem with A, it has no record of why B was rejected and may switch to B, undoing progress.
Multi-file relationships lost
The agent edited 4 files in a coordinated change. The summary mentions the files but not the specific coordination: the type definition in types.ts that must match the validator in validate.ts that must match the handler in routes.ts.
Partial progress lost
The agent was halfway through a 6-step plan. After compaction, it knows it 'started the migration' but not which steps are done and which remain. It may redo completed steps or skip remaining ones.
These limitations exist because summarization is fundamentally a compression operation. A 100K token conversation compressed to 5K tokens keeps only 5% of its original token count. Custom compact prompts help you choose which 5% to keep, but the other 95% is still gone.
The better approach is to generate fewer tokens in the first place. If your operations consume 3K tokens instead of 30K, you can work 10x longer before needing to compact at all. That is the premise behind FlashCompact.
FlashCompact: Prevent Context Waste Instead of Compressing It
/compact is a necessary tool, but it treats the symptom. The root cause is that standard Claude Code operations waste tokens. FlashCompact from Morph addresses the root cause: it reduces the tokens consumed per operation so compaction fires less often or not at all.
How FlashCompact works
FlashCompact is a set of three tools that attack the two biggest sources of context waste:
WarpGrep: Targeted Search
Returns only relevant code snippets with file paths and line numbers. One WarpGrep call replaces 5-10 file reads. 0.73 F1 score in 3.8 average steps on SWE-Bench. Tokens per search: 500-2K instead of 10K-40K.
Fast Apply: Compact Diffs
Outputs only changed lines with minimal surrounding context. A 3-line edit in a 200-line file produces ~20 tokens of diff instead of ~2,000 tokens of full rewrite. 10,500 tokens per second throughput.
Morph Compact: Verbatim Deletion
For deletions and large-scale removals, Morph Compact runs at 3,300+ tok/s with zero hallucination. No rewrite errors, no accidentally changed lines. Pure removal at the speed of the token stream.
The math: standard Claude Code vs FlashCompact
| Operation | Standard Claude Code | With FlashCompact |
|---|---|---|
| Find a function across codebase | 5-8 file reads (10K-40K tokens) | 1 WarpGrep call (500-2K tokens) |
| Edit 3 lines in a 200-line file | Full rewrite (2K tokens output) | Compact diff (20 tokens output) |
| Debug a failing test | 15+ file reads, multiple grep (50K+) | Targeted search + diff edits (5K-10K) |
| Refactor across 10 files | 10 full rewrites (20K-50K tokens) | 10 compact diffs (200-500 tokens) |
| Time to first compact | 15-30 minutes of active work | 1-3 hours of active work |
FlashCompact does not eliminate the need for /compact entirely. Very long sessions, large codebase explorations, and complex multi-file refactors will still eventually fill the context window. But by reducing the token cost per operation by 5-100x, FlashCompact pushes compaction from "every 15-30 minutes" to "once per multi-hour session," and in many cases, not at all. State-of-the-art results on SWE-Bench Pro confirm this: agents using these tools complete more tasks with fewer compaction cycles.
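The 5-100x range can be reproduced from the token figures in the comparison table. The sketch below pairs the low estimate with the low estimate and the high with the high, and treats the "50K+" debugging figure as exactly 50,000. Both are simplifying assumptions, so read the result as a rough consistency check and not as a benchmark.

```python
# (standard_low, standard_high), (flash_low, flash_high), taken from the table rows.
ROWS = {
    "find a function": ((10_000, 40_000), (500, 2_000)),
    "edit 3 lines": ((2_000, 2_000), (20, 20)),
    "debug a failing test": ((50_000, 50_000), (5_000, 10_000)),
    "refactor 10 files": ((20_000, 50_000), (200, 500)),
}


def reduction(standard, flash):
    # Like-for-like ratios: low vs low, high vs high.
    ratios = (standard[0] / flash[0], standard[1] / flash[1])
    return min(ratios), max(ratios)


factors = {name: reduction(std, fc) for name, (std, fc) in ROWS.items()}
overall = (
    min(low for low, _ in factors.values()),
    max(high for _, high in factors.values()),
)
print(factors)
print(overall)  # (5.0, 100.0)
```

Debugging sits at the low end (5-10x) because it still needs several targeted searches, while single edits and refactors sit at the high end because a diff replaces a full-file rewrite.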
Extend Your Sessions with FlashCompact
WarpGrep and Fast Apply cut context waste from both reads and writes. Agents run 3-5x longer before /compact is needed. Drop-in MCP integration, no workflow changes.
How to Reduce Your Need for /compact
Beyond FlashCompact, several workflow changes reduce how often you need to compact.
1. One task per session
The context from task A is pure noise for task B. Run /clear between unrelated tasks instead of carrying forward a polluted context that triggers compaction sooner. Use /rename before clearing so you can /resume the session later if needed.
2. Use CLAUDE.md for persistent instructions
CLAUDE.md files load at the start of every session and survive every compaction cycle. Move your coding conventions, project structure, key file paths, and workflow rules from conversation into CLAUDE.md. This means the agent never needs to re-learn these after compaction, saving thousands of tokens per cycle. Keep it under 200 lines for best adherence.
3. Compact at logical breakpoints, not when forced
Run /compact proactively after finishing a sub-task, not reactively when the warning appears. A compact after "authentication is working and tests pass" produces a vastly better summary than a compact mid-debugging with 5 hypotheses in flight.
4. Delegate verbose operations to subagents
Each subagent gets its own isolated 200K context window. Delegate tasks that produce large outputs (running test suites, searching large codebases, processing log files) to subagents. Only the relevant result returns to your main session. Three parallel subagents add roughly 600K tokens of working memory on top of your main session.
5. Reduce MCP server overhead
Each MCP server adds tool definitions to every request. Run /context to see the per-server cost. Disable servers you are not actively using. Prefer CLI tools (like gh, aws, gcloud) over MCP servers when possible because CLIs do not add persistent tool definitions.
6. Write specific prompts
Vague prompts like "improve this codebase" trigger broad scanning, reading dozens of files to figure out what you mean. Specific prompts like "add input validation to the login function in src/auth/login.ts" let the agent work with minimal file reads.
Prompt specificity comparison
# Vague (triggers 10-20 file reads, ~30K-80K tokens)
Fix the bug
# Specific (triggers 1-3 file reads, ~3K-10K tokens)
The checkout endpoint in src/api/checkout.ts returns 500 when
the card token is expired. The error is in the chargeCard()
function around line 84. Add a try/catch for the Stripe
TokenExpiredError and return a 402 with a descriptive message.
Frequently Asked Questions
What does /compact do in Claude Code?
The /compact command summarizes your conversation history to free up context space in Claude Code's 200K token window. It compresses file reads, tool outputs, and messages into a structured summary, then restarts the session from that summary. You can pass optional instructions to control what gets preserved, like /compact preserve all file paths and the current debugging state.
What is the difference between /compact and /clear?
/compact summarizes the conversation and preserves a compressed version of your work so far. /clear wipes the conversation entirely and starts a completely fresh session. Use /compact when continuing the current task with less context. Use /clear when switching to a completely different task where the old context would be noise. /clear recovers more tokens (full window), but /compact preserves your progress.
How do I customize the compact prompt?
Two ways. For one-time compaction, pass instructions after the command: /compact preserve the auth middleware changes and failing test names. For persistent rules that apply to every compaction including auto-compact, add a "Compact Instructions" section to your CLAUDE.md file. These instructions survive compaction because CLAUDE.md is re-read from disk after every compact cycle.
What is micro compact?
Micro compact is a lighter compaction pass that fires before the full auto-compact threshold. It selectively clears older tool outputs (file reads, grep results, command output) while keeping conversation messages intact. It triggers around 60-70% context usage and typically frees 10-30K tokens, delaying or preventing full auto-compact.
Can I disable auto-compact?
No. Auto-compact is a built-in safety mechanism. You can delay it by running /compact manually at strategic points, reducing token waste per operation, and delegating verbose tasks to subagents. But the automatic trigger cannot be turned off. See our auto-compact deep dive for strategies to manage it.
Does /compact delete my code changes?
No. /compact only compresses conversation history in memory. All file changes, git commits, and code edits on disk are preserved. The risk is amnesia: the agent may forget what it changed and make contradictory edits because the summary lost those details. Use specific compact instructions to minimize this.
How do I know when to run /compact?
Run /context to see your current token usage. Compact manually when usage passes 60-70% and you're at a logical breakpoint (feature complete, bug fixed, sub-task done). Manual compaction at clean breakpoints produces much better summaries than auto-compact firing mid-task. You can also configure your status line to show context usage continuously.
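The rule in this answer reduces to two conditions that must both hold. A minimal sketch, assuming you read the used-token count from /context yourself and that the percentage is measured against the full 200K window; the function name and the 60% default are illustrative.

```python
def should_compact(used_tokens, at_breakpoint, window=200_000, threshold=0.60):
    # Compact only when usage is past the threshold AND the work is at a
    # clean stopping point (feature complete, bug fixed, sub-task done).
    usage = used_tokens / window
    return at_breakpoint and usage >= threshold


print(should_compact(130_000, at_breakpoint=True))   # True: 65% used, clean breakpoint
print(should_compact(130_000, at_breakpoint=False))  # False: mid-task, wait if you can
print(should_compact(80_000, at_breakpoint=True))    # False: only 40% used
```

The breakpoint condition matters as much as the number: compacting at 65% in the middle of debugging tends to produce a worse summary than waiting for the next clean stopping point.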
Do CLAUDE.md instructions survive /compact?
Yes. CLAUDE.md files are re-read from disk after every compaction and re-injected into the session. Instructions in CLAUDE.md are never lost. Only instructions given in conversation can be lost during compaction. This is why you should put critical rules in CLAUDE.md rather than repeating them in chat.
What is the best alternative to relying on /compact?
FlashCompact prevents the context waste that makes compaction necessary. WarpGrep returns only relevant snippets instead of full files (0.73 F1 in 3.8 steps). Fast Apply uses compact diffs at 10,500 tok/s instead of full file rewrites. Together, they reduce per-operation token consumption by 5-100x, extending sessions from 15-30 minutes to 1-3 hours before compaction is needed.
How do compact instructions in CLAUDE.md work?
Add a heading called "Compact Instructions" to your CLAUDE.md with rules for what to preserve. For example: "When compacting, preserve all modified file paths, current test results, and remaining TODO items." These apply to both manual /compact and auto-compact. Claude reads the section when performing compaction and uses it to guide which information to keep in the summary.
Stop Losing Context. Start Preserving It.
FlashCompact reduces token waste from reads and writes so /compact fires less often. WarpGrep searches return only what matters. Fast Apply edits output only the diff. Drop-in MCP integration for Claude Code.