---
title: "Bringing FastApply Back to Cursor with MCP"
url: "https://www.morphllm.com/blog/cursor-mcps"
description: "How Morph MCP Server restores 10,500 tok/sec edits to Cursor and adds Warp-Grep for better context retrieval"
date: "2025-11-24"
author: "Tejas Bhakta"
---
# Bringing FastApply Back to Cursor with MCP

**TL;DR**: Cursor removed its native apply model. We built an MCP server that brings FastApply (10,500 tok/sec) and Warp-Grep back to Cursor. Result: ~35% faster end-to-end edits compared to search & replace, higher accuracy, lower token costs.

One config file. No workflow changes.

---

## The Problem

Cursor shipped with a fast apply model. Then they removed it.

Now every code edit goes through the main model—slow, expensive, prone to hallucination on large files. Without a specialized apply model, edits take longer and token costs spike when the model rewrites entire files for single-line changes.

**The core issue**: No specialized apply model means using a reasoning model for a merging task. It's like using GPT-4 to run `git merge`.

---

## The Decision: MCP + FastApply

Model Context Protocol lets you plug external tools into Cursor. We built an MCP server that:

1. **Intercepts file edits** before they hit Cursor's default flow
2. **Routes to FastApply** (our 10,500 tok/sec merge model)
3. **Returns merged code** in 1-3 seconds for most files

Plus: **Warp-Grep** sub-agent for semantic code search. Think Cognition's SWE-grep, but available to everyone.

**Why Warp-Grep matters**:
- Cursor's default grep misses context across file boundaries
- Warp-Grep uses a small planning model to execute multi-step searches
- Returns semantically ranked results, not just text matches

---

## Installation (60 Seconds)

Add to `~/.cursor/mcp.json`:

```json
{
  "mcpServers": {
    "filesystem-with-morph": {
      "command": "npx",
      "args": ["@morphllm/morphmcp"],
      "env": {
        "MORPH_API_KEY": "your-api-key-here",
        "ALL_TOOLS": "true"
      }
    }
  }
}
```

Restart Cursor. Done.

**Get your API key**: [morphllm.com/dashboard/api-keys](https://morphllm.com/dashboard/api-keys)

---

## Results: Measured Performance

Based on [our benchmarks](https://morphllm.com/benchmarks) comparing Morph vs search & replace across real codebases.

### End-to-End Performance

| Metric | Result |
|--------|--------|
| Apply latency | 1-3 seconds |
| End-to-end speed improvement | ~35% faster |
| Average speed improvement | ~46% vs search & replace |

### Why FastApply Wins

**Search & Replace** requires a separate tool call for each chunk being edited. Multiple edits = multiple round trips.

**FastApply** handles all edits to a file in a single call. The model describes what it wants to change, and FastApply merges everything at once.

This means fewer tool calls, less coordination overhead, and faster end-to-end completion.
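As a concrete sketch, a single multi-edit call might carry a payload like the one below. The field names follow the input format shown later in this post; the exact MCP tool-call schema here is an assumption, not the real API.

```python
# Hypothetical single FastApply payload covering two edits in one file.
# Field names follow the input format shown later in this post; the exact
# tool-call schema is an assumption, not the real API.
payload = {
    "original_code": (
        "import os\n"
        "\n"
        "def load(path):\n"
        "    return open(path).read()\n"
        "\n"
        "def save(path, data):\n"
        "    open(path, 'w').write(data)\n"
    ),
    "edit_snippet": (
        "def load(path):\n"
        "    if not os.path.exists(path):\n"
        "        raise FileNotFoundError(path)\n"
        "    // ... existing code ...\n"
        "\n"
        "def save(path, data):\n"
        "    os.makedirs(os.path.dirname(path) or '.', exist_ok=True)\n"
        "    // ... existing code ...\n"
    ),
}

# Both hunks travel in one call; search & replace would need two round trips.
num_hunks = payload["edit_snippet"].count("// ... existing code ...")
```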

### Warp-Grep Impact

Warp-Grep combines semantic search with ripgrep for better context retrieval.

**How it works**:
1. Parse query into search plan
2. Execute multi-step ripgrep commands
3. Rerank results by semantic relevance
4. Return top-k with surrounding context

This reduces irrelevant files loaded into context, helping agents find the right code faster.

---

## How FastApply Works Under the Hood

Unlike full file rewrites or brittle diffs, FastApply uses a specialized merge model.

**Input format**:
```typescript
{
  original_code: "def calculate(x):\n    return x * 2",
  edit_snippet: "def calculate(x):\n    # Add validation\n    if x < 0:\n        raise ValueError(\"x must be positive\")\n    // ... existing code ..."
}
```

**Output** (streamed at 10,500 tok/sec):
```python
def calculate(x):
    # Add validation
    if x < 0:
        raise ValueError("x must be positive")
    return x * 2
```

The model understands:
- `// ... existing code ...` markers mean "keep this section"
- Structural context (indentation, scope, imports)
- Variable renaming across the file
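As a rough intuition for what the marker means (a toy string expansion, not FastApply's learned merge), a single trailing marker can be read as "keep the original lines the snippet did not restate":

```python
MARKER = "// ... existing code ..."

def naive_merge(original: str, snippet: str) -> str:
    """Toy illustration only: expand a trailing marker by appending the
    original lines the snippet has not already restated. FastApply is a
    learned model, not this string algorithm."""
    orig_lines = original.splitlines()
    out = []
    for line in snippet.splitlines():
        if line.strip() == MARKER:
            # Keep every original line not already present in the output.
            out.extend(l for l in orig_lines if l not in out)
        else:
            out.append(line)
    return "\n".join(out)

original = "def calculate(x):\n    return x * 2"
snippet = (
    "def calculate(x):\n"
    "    # Add validation\n"
    "    if x < 0:\n"
    "        raise ValueError(\"x must be positive\")\n"
    "    // ... existing code ..."
)
merged = naive_merge(original, snippet)
```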

**Why this approach works**:
1. Smaller model trained specifically on merging
2. No reasoning overhead—just pattern matching + structural understanding
3. Speculative decoding using the original file as a prior
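Point 3 can be pictured with a toy example (greatly simplified; real speculative decoding verifies draft tokens against the model's own distribution). Unedited regions of the original file match the output exactly, so long runs of draft tokens are accepted in a single verification step:

```python
def accepted_prefix(draft_tokens, target_tokens):
    """Count how many draft tokens (here: tokens of the original file,
    used as a prior) agree with the target output and can be accepted in
    one verification step. Toy sketch of the idea only."""
    n = 0
    for d, t in zip(draft_tokens, target_tokens):
        if d != t:
            break
        n += 1
    return n

# The unedited prefix of the file is accepted in bulk; decoding only slows
# down where the edit actually diverges from the original.
draft = ["def", "calc", "(", "x", ")", ":", "return", "x"]
target = ["def", "calc", "(", "x", ")", ":", "if", "x"]
n = accepted_prefix(draft, target)
```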

---

## Warp-Grep: Semantic Code Search

Traditional grep: regex match, return all lines.

Warp-Grep: sub-agent workflow that combines grep with semantic understanding.

**Example query**: "Where do we handle rate limiting?"

**Agent plan**:
```text
1. rg "rate.?limit" --type ts
2. rg "RateLimiter|Throttle" --type ts
3. Read top 5 files
4. Rank by: class definitions > usage > comments
5. Return top 3 with surrounding context
```
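The ranking step can be sketched with a simple heuristic over match kinds. This is an illustrative stand-in: the real reranking uses a small model, not rules.

```python
import re

def rank_matches(matches):
    """Rank grep hits: definitions first, then usages, then comments.
    Heuristic sketch only; Warp-Grep's actual reranker is a model."""
    def score(line: str) -> int:
        if re.match(r"\s*(class|def|function)\b", line):
            return 0  # definitions rank highest
        if line.lstrip().startswith(("#", "//")):
            return 2  # comments rank lowest
        return 1      # everything else is treated as a usage
    return sorted(matches, key=score)

hits = [
    "    limiter.check(user)",
    "# rate limit notes",
    "class RateLimiter:",
]
top = rank_matches(hits)
```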

**Pricing**: $0.30 per 1M input tokens, $0.30 per 1M output tokens.

---

## Configuration Options

### Edit-Only Mode

Just want fast edits without extra filesystem access? Set `ALL_TOOLS` to `"false"` to expose only the `edit_file` tool:

```json
{
  "env": {
    "MORPH_API_KEY": "your-key",
    "ALL_TOOLS": "false"
  }
}
```

### Workspace Mode (Default)

Automatically detects project root by looking for `.git`, `package.json`, etc.

No need to hardcode paths. Works across all your projects.

### Per-Project Config

Lock MCP to specific directory:

```json
{
  "args": [
    "@morphllm/morphmcp",
    "/Users/you/specific-project/"
  ]
}
```

---

## Real-World Use Cases

**1. Multiple Edits Per File**

FastApply handles all edits to a file in one call. Search & replace needs a separate tool call for each chunk.

**2. Multi-File Context**

Warp-Grep fetches semantically relevant context in one shot instead of manual grep + copy/paste.

**3. Agent Workflows**

If you're building agents on Cursor: FastApply's higher accuracy means fewer retries, which speeds up end-to-end task completion.

**4. Consistent Edits**

Apply suggested changes without worrying about exact string matching. The model understands code structure.

---

## Comparison to Cursor Native Apply

| Feature | Cursor Native (removed) | Morph MCP |
|---------|------------------------|-----------|
| Throughput | ~1000 tok/sec | 10,500 tok/sec |
| Max file size | ~400 lines | 1500+ lines |
| API access | No | Yes |
| Semantic search | No | Warp-Grep |
| Cost | Bundled | $0.80/1M in, $1.20/1M out |

Cursor's original apply model was Llama-3-70B. We use a smaller model trained specifically on code merging with speculative decoding.

---

## Limitations

**What FastApply doesn't do**:
- Won't write your code from scratch (use Claude/GPT for that)
- No reasoning or planning (pure merge task)
- Requires `// ... existing code ...` markers in snippets

**Warp-Grep limitations**:
- Requires `ripgrep` installed locally (when using local provider)
- Best on structured languages (Python, TS, Go)
- Sub-agent calls add latency (~300ms overhead)

**Current max file size**: 1500 lines. We're training on 2500-line contexts now.

---

## What You Get

Install Morph MCP:
- 10,500 tok/sec throughput
- 1-3 second apply latency
- ~35% faster end-to-end vs search & replace
- Warp-Grep semantic search
- Higher accuracy means fewer retries

One config file. Works across all Cursor projects.

[Get API key](https://morphllm.com/dashboard/api-keys) • [Full MCP docs](https://docs.morphllm.com/guides/mcp) • [Warp-Grep docs](https://docs.morphllm.com/sdk/components/warp-grep)

**Pricing**: $0.80/1M input, $1.20/1M output for FastApply. $0.30/1M tokens for Warp-Grep.

Free tier: 100k tokens/month to try it out.

---

## Why This Matters

The best coding tools separate reasoning from execution.

Cursor's main models (Claude, GPT-4) should think about what to change.

FastApply should merge those changes. Fast.

Warp-Grep should find context. Semantically.

This separation of concerns is how modern coding agents achieve better accuracy and speed.

Now it's available to everyone using Cursor.
