Claude Code Slow or Stuck? Here's What's Actually Happening

Claude Code running slow, hanging, or getting stuck mid-task? Most causes trace to context bloat, search overhead, or rate limiting. Diagnosis steps and fixes for each.

March 13, 2026 · 2 min read

Diagnose the Problem

Before fixing anything, figure out which kind of slow you're dealing with. They look similar but have different root causes.

Symptom | Likely Cause | Quick Fix
Responses get slower each turn | Context bloat | /compact or start a fresh conversation
Long pause before response starts | Codebase search grinding through files | Use WarpGrep MCP server
Responses slow after heavy usage | Rate limiting / throttling | Check /status, wait for window reset
Agent stuck mid-task, no output | Hung process or infinite loop | Press Escape, try /compact
Everything slow, not just your session | Anthropic server issues | Check status.anthropic.com

Run /status first. If it shows you're near your usage limit, that's the answer. If usage is fine, check the statusline at the bottom of your terminal for context percentage. Above 70% means context bloat.

Context Bloat: The Progressive Slowdown

Claude re-reads the entire conversation history with every prompt. This is not optional. The model has no persistent memory between turns. Everything from turn 1 gets re-processed at turn 20.

  • ~3s: response time in a fresh conversation
  • ~15s: response time by turn 20
  • 100K+: tokens accumulated by turn 15

What accumulates in context: your prompts, Claude's responses, every file Claude read, every tool call result, every search output. A single file read can add 5,000-10,000 tokens. By turn 15, you're carrying more context about past operations than about the current task.

Fixes

  • /compact compresses the conversation history, keeping key information while discarding verbose tool outputs and intermediate results. Use it when context hits 60-70%.
  • /clear starts a completely fresh conversation. Use it when switching to an unrelated task. Context from a CSS debugging session has no business in your auth refactor.
  • Lean CLAUDE.md files. Your CLAUDE.md is injected into every prompt. If it's 2,000 words of outdated instructions, that's 3,000+ tokens of overhead before you even type.
  • Batch related instructions. "Fix the bug AND update the test" as one prompt uses fewer total tokens than two separate prompts, because history isn't duplicated.
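To put a number on the CLAUDE.md overhead, you can ballpark its token cost from the file's size. A minimal sketch, assuming the common ~4-characters-per-token rule of thumb for English text; this is an approximation, not Anthropic's actual tokenizer:

```shell
# Ballpark the token overhead a CLAUDE.md adds to every prompt.
# Heuristic: ~4 characters per token for English text (a rough
# approximation, not the real tokenizer).
estimate_tokens() {
  local chars
  chars=$(wc -c < "$1")
  echo $(( chars / 4 ))
}
```

Run `estimate_tokens CLAUDE.md`: if the result is in the thousands, trimming the file pays off on every single turn, since it is re-injected with each prompt.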

Search Overhead: The Hidden Time Sink

When you ask Claude Code to modify existing code, it first needs to find that code. The built-in approach is sequential file reading: grep for a pattern, read a file, check if it's the right one, repeat. Cognition measured this at 60% of total agent execution time.

The Search Tax

On a project with 500+ files, a single "find the authentication middleware" request can trigger 15-20 file reads. Each read adds 3-10 seconds and 5,000+ tokens to context. Total: 45-200 seconds and 75,000-200,000 tokens just to locate the code, before any actual work begins.

This compounds with context bloat. Every file Claude reads stays in the conversation. After two search-heavy operations, context is bloated with file contents that are no longer relevant.
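To see why this gets expensive, here's what the sequential approach roughly boils down to. A sketch only; the pattern and directory are placeholder names, not the agent's actual tooling:

```shell
# Emulate the native search loop: grep for candidate files, then
# read each one in turn until the right match surfaces. In the real
# agent, every full file read adds ~5,000+ tokens to the context.
sketch_search() {
  local pattern="$1" dir="$2"
  grep -rl "$pattern" "$dir" | while read -r f; do
    echo "== $f"
    head -n 40 "$f"
  done
}
```

Something like `sketch_search "authMiddleware" src/` (hypothetical names) makes the cost visible: each candidate file is read and kept, whether or not it turns out to be relevant.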

Fix: Offload Search to WarpGrep

WarpGrep is an MCP server that replaces Claude Code's file-by-file search with a trained search model. It finds the right code in 3.8 steps on average, compared to 10-20 with native search.

  • 3.8: average search steps (WarpGrep)
  • 0.73: F1 score on search benchmarks
  • 60%: agent time spent on search (Cognition)

Because WarpGrep runs as a separate model, the search tokens don't count against your Claude usage. Less context consumed means faster responses and fewer rate limit issues.

Rate Limiting and Throttling

Anthropic applies two kinds of limits: hard cutoffs (you get an explicit "usage limit reached" message) and soft throttling (responses get progressively slower as you approach the limit). The throttling is harder to detect because there's no error message.

How to Check

  1. Run /status to see current usage vs. plan limits
  2. If you're above 80% of your 5-hour window, throttling is likely
  3. Check if Claude web usage is eating into your pool (they share limits)

Plan Limits

Plan | 5-Hour Window | Weekly Active Hours
Pro ($20/mo) | ~45 prompts | 40-80 Sonnet hrs
Max 5x ($100/mo) | ~225 prompts | 480 Sonnet / 40 Opus hrs
Max 20x ($200/mo) | ~800 prompts | 480 Sonnet / 40 Opus hrs

Fixes

  • Enable extra usage in Settings > Usage to continue at API rates after hitting limits
  • Switch to Sonnet for routine tasks. Opus consumes roughly 5x more active time per prompt.
  • Reduce token waste. Fewer search tokens = slower quota drain. WarpGrep helps here.
  • Avoid peak hours. Server-side throttling is worse during US business hours (9am-6pm PT).

Stuck Mid-Task: Agent Not Responding

This is different from slow responses: the agent produces no output at all. Common scenarios:

Infinite Tool Loops

Claude calls the same tool repeatedly without making progress. Often happens during search when the agent can't find the right file and keeps trying variations of the same grep pattern. Press Escape to interrupt.

Extended Thinking Timeout

With extended thinking enabled, Claude can spend 60+ seconds reasoning before producing any visible output. This looks like a hang but isn't. Wait longer, or disable extended thinking for simpler tasks.

MCP Server Hang

A broken or unresponsive MCP server can block Claude Code entirely. If you recently added a new MCP server and things stopped working, disable it in settings and test.
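One way to check a stdio MCP server directly is to pipe it a JSON-RPC initialize request and see whether anything comes back. A sketch, assuming a standard stdio MCP server and GNU `timeout`; the launch command you pass is whatever your MCP configuration runs for that server:

```shell
# Liveness probe for a stdio MCP server: send a JSON-RPC initialize
# request and wait up to 5 seconds for a first line of output.
# Empty output suggests the server is hanging. Pass the server's
# launch command as the arguments, e.g. probe_mcp node server.js
probe_mcp() {
  printf '%s\n' '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"probe","version":"0.1"}}}' \
    | timeout 5 "$@" | head -n 1
}
```

If the probe prints nothing, the server itself is the bottleneck; remove it from your MCP settings and retest before blaming Claude Code.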

Network Issues

Claude Code requires a persistent connection to Anthropic's API. Flaky WiFi, VPN timeouts, or corporate firewalls can cause silent failures. Check your network if other causes are ruled out.

Recovery Steps

  1. Press Escape to interrupt the current operation
  2. Run /compact to reduce context size
  3. If still stuck, run /clear and re-state your task
  4. As a last resort, close and reopen Claude Code

Server-Side Slowdowns

Sometimes the problem is on Anthropic's end. On March 11, 2026, Claude experienced a major outage affecting logins and response times. In February 2026, a prompt caching bug caused tokens to be consumed at 2-3x the normal rate.

How to Tell

  • Check status.anthropic.com for active incidents
  • Search Twitter/X for "claude down" to see if others are affected
  • If a fresh conversation with a one-line prompt is slow, the problem is server-side
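The status check can also be scripted. status.anthropic.com appears to be a standard Statuspage instance, so the usual `/api/v2/status.json` endpoint should report an overall indicator (`none`, `minor`, `major`, `critical`); treat that exact URL path as an assumption:

```shell
# Pull the overall health indicator out of a Statuspage status.json
# payload. Reads JSON on stdin, so it works with curl or a saved file.
extract_indicator() {
  grep -o '"indicator":"[^"]*"' | head -n 1
}

# Live check (endpoint path is the common Statuspage convention):
# curl -s https://status.anthropic.com/api/v2/status.json | extract_indicator
```

Anything other than `none` means the slowdown is on Anthropic's end and none of the local fixes above will help.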

During outages, having an extra usage budget set up won't help. Your only options are waiting for the incident to resolve or switching to a different tool temporarily.

Frequently Asked Questions

Why is Claude Code so slow?

Three causes: context bloat (conversation history grows with each turn), search overhead (reading files one by one to find code), or rate limiting (approaching your plan's usage ceiling). Check /status and your context percentage to determine which.

How do I fix Claude Code when it gets stuck?

Press Escape to interrupt, then run /compact to compress history. If still stuck, use /clear for a fresh start. Check that MCP servers aren't hanging by disabling recently added ones. For search-related hangs, WarpGrep replaces sequential file reading with semantic search that resolves in 3.8 steps.

Does Claude Code get slower the longer you use it?

Yes, within a single conversation. Each turn adds history that Claude re-processes with every prompt. A fresh conversation responds in ~3 seconds. By turn 20, the same prompt can take 15+ seconds. Start new conversations for unrelated tasks and use /compact regularly.

Is Claude Code down or is it just me?

Check status.anthropic.com for active incidents. If the status page is clear, run a fresh one-line prompt in a new conversation. If that's fast, the problem is local (context bloat or rate limiting). If it's slow too, you may be hitting server-side throttling during peak hours.

Related Articles

Cut 60% of the Search Time That Slows Claude Code Down

WarpGrep finds code in 3.8 steps instead of 15. Less search = less context bloat = faster responses. Installs as an MCP server in under a minute.