Playwright MCP is Microsoft's Model Context Protocol server for browser automation. It is $0: Apache-2.0 licensed, run locally with npx @playwright/mcp@latest, no API key, no hosted tier, no rate limits. The cost that actually matters is LLM tokens, and that cost is measurable: ~114,000 tokens per typical task, about $0.34 on Claude Sonnet 4.6. This page has the full math, all 23 tools, and every client config.
Is Playwright MCP Free? Pricing and Real Cost
Yes. @playwright/mcp (v0.0.76 on npm as of June 2026) is Apache-2.0 open source, maintained by Microsoft. It runs on your machine via npx, requires Node.js 18+, and has no paid plan, no usage metering, and no API rate limits. Searches for "playwright mcp pricing" have only one honest answer: the software is $0 and the bill is your LLM provider's token meter.
That bill is quantifiable. The Playwright team benchmarked a typical browser automation task at roughly 114,000 tokens through MCP versus 27,000 through the Playwright CLI (the snapshots-on-disk alternative they ship for filesystem-capable agents). Snapshots dominate the count, and snapshots are input tokens. At current per-token prices:
| Model | Input price | Per task via MCP | Per task via CLI | 1,000 tasks/mo via MCP |
|---|---|---|---|---|
| Claude Sonnet 4.6 | $3.00 / 1M tokens | ~$0.34 | ~$0.08 | ~$342 |
| Claude Haiku 4.5 | $1.00 / 1M tokens | ~$0.11 | ~$0.03 | ~$114 |
Two caveats on the math. First, it ignores output tokens (the agent's tool calls are short, typically a few hundred tokens per step) and prompt caching, which discounts repeated prefixes. Second, "typical task" is the Playwright team's benchmark workload; a 3-step smoke check costs far less and a 50-step crawl costs more, because each navigation appends a fresh accessibility snapshot to context.
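The table's arithmetic can be reproduced in a few lines. This is a minimal sketch using only the token counts and input prices quoted above; the names `task_cost`, `MCP_TOKENS`, and `CLI_TOKENS` are illustrative, and like the table it ignores output tokens and prompt caching.

```python
# Token counts and input prices are the figures quoted in the table above.
MCP_TOKENS = 114_000
CLI_TOKENS = 27_000
INPUT_PRICE_PER_MTOK = {"Claude Sonnet 4.6": 3.00, "Claude Haiku 4.5": 1.00}


def task_cost(tokens: int, price_per_mtok: float) -> float:
    """Input-token cost in dollars for one task (output tokens and caching ignored)."""
    return tokens / 1_000_000 * price_per_mtok


for model, price in INPUT_PRICE_PER_MTOK.items():
    mcp = task_cost(MCP_TOKENS, price)
    cli = task_cost(CLI_TOKENS, price)
    print(f"{model}: MCP ${mcp:.2f}, CLI ${cli:.2f}, 1,000 MCP tasks ${mcp * 1000:.0f}")
```

Swapping in your own measured token count and your provider's current price gives a per-task estimate for a workload that differs from the benchmark.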
The pricing summary
Playwright MCP: $0 software, ~$0.03 to ~$0.34 of tokens per task depending on model and transport (Haiku via CLI at the low end, Sonnet via MCP at the high end). No subscription exists to compare against. If a vendor charges you for "Playwright MCP hosting", you are paying for their infrastructure, not the server.
mcp.json Config for Every Client
Every client points at the same command: npx @playwright/mcp@latest. No global install; browser binaries download on first use.
Cursor (~/.cursor/mcp.json)
~/.cursor/mcp.json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}

Project-scoped: put the same block in .cursor/mcp.json inside the repo. Cursor also accepts it via Settings, then MCP, then Add new MCP Server.
Claude Code
One command
claude mcp add playwright npx @playwright/mcp@latest

VS Code
CLI install
code --add-mcp '{"name":"playwright","command":"npx","args":["@playwright/mcp@latest"]}'

OpenAI Codex (~/.codex/config.toml)
~/.codex/config.toml
[mcp_servers.playwright]
command = "npx"
args = ["@playwright/mcp@latest"]

Windsurf and other MCP clients
Any client that speaks MCP over stdio takes the identical mcpServers JSON block from the Cursor example. In Windsurf, add it through the Cascade MCP server settings.
Pass flags as extra array entries, e.g. "args": ["@playwright/mcp@latest", "--headless", "--caps=vision,pdf"].
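If you script your setup, the same block can be merged into an existing config without clobbering other servers. The following is a rough sketch, not an official tool: `add_playwright_server` and the `other-server` entry are hypothetical, and the only structure assumed is the mcpServers shape shown in the Cursor example.

```python
import json


def add_playwright_server(config: dict, extra_flags: tuple = ()) -> dict:
    """Return a copy of an mcp.json-style dict with a "playwright" entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["playwright"] = {
        "command": "npx",
        # Flags are simply extra entries after the package name.
        "args": ["@playwright/mcp@latest", *extra_flags],
    }
    return {**config, "mcpServers": servers}


# "other-server" is a made-up entry standing in for whatever is already configured.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
updated = add_playwright_server(existing, ("--headless", "--caps=vision,pdf"))
print(json.dumps(updated, indent=2))
```

The function returns a new dict rather than mutating its input, so the original config can still be written back unchanged if anything goes wrong.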
All 23 Core Tools (Reference Table)
The default tool set as of v0.0.76. Element-targeting tools take two inputs: element (a human-readable description) and ref (the exact element reference from the latest browser_snapshot). The model never guesses pixel coordinates in the default mode.
| Tool | What it does | Key inputs |
|---|---|---|
| browser_navigate | Navigate to a URL | url |
| browser_navigate_back | Go back to the previous page | none |
| browser_click | Click an element (single, double, right) | element, ref |
| browser_type | Type text into an editable element | element, ref, text |
| browser_fill_form | Fill multiple form fields in one call | fields[] |
| browser_select_option | Select dropdown option(s) | element, ref, values |
| browser_hover | Hover over an element | element, ref |
| browser_drag | Drag from one element to another | start + end element/ref |
| browser_drop | Drop onto a target element | element, ref |
| browser_press_key | Press a keyboard key | key |
| browser_handle_dialog | Accept or dismiss alert/confirm/prompt | accept, promptText |
| browser_file_upload | Upload file(s) to a file chooser | paths |
| browser_snapshot | Accessibility-tree snapshot of the page (the core observation tool) | none |
| browser_take_screenshot | Capture a screenshot (viewport, full page, or element) | optional element/ref, fullPage |
| browser_console_messages | Return browser console output | none |
| browser_network_requests | List network requests since page load | none |
| browser_network_request | Single-request counterpart to browser_network_requests | request |
| browser_evaluate | Run JavaScript on the page or an element | function, optional ref |
| browser_run_code_unsafe | Run arbitrary Playwright code (full API, unsandboxed) | code |
| browser_resize | Resize the browser window | width, height |
| browser_wait_for | Wait for text to appear/disappear or a delay | text, textGone, or time |
| browser_tabs | List, open, select, or close tabs | action, index |
| browser_close | Close the page | none |
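The element/ref pairing is easier to picture with the snapshot in hand. As a rough illustration, this sketch pulls role, name, and ref out of snapshot lines, assuming they look like the lines in this article's login example (`- button "Sign in" [ref=e5]`); real snapshots may carry nesting and extra attributes that this pattern does not handle, and `refs_by_name` is a hypothetical helper, not part of Playwright MCP.

```python
import re

# Assumed line shape, matching the login example in this article:
#   - textbox "Email address" [ref=e3]
SNAPSHOT_LINE = re.compile(r'-\s+(?P<role>\w+)\s+"(?P<name>[^"]*)"\s+\[ref=(?P<ref>\w+)\]')


def refs_by_name(snapshot: str) -> dict:
    """Map each element's accessible name to its (role, ref) pair."""
    found = {}
    for line in snapshot.splitlines():
        match = SNAPSHOT_LINE.search(line)
        if match:
            found[match["name"]] = (match["role"], match["ref"])
    return found


snapshot = '''
- textbox "Email address" [ref=e3]
- textbox "Password" [ref=e4]
- button "Sign in" [ref=e5]
'''
print(refs_by_name(snapshot)["Sign in"])  # ('button', 'e5')
```

In practice the model does this lookup itself by reading the snapshot text; the point is that a ref is copied from the latest snapshot, never invented.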
A typical tool sequence (login form)
// 1. Navigate
browser_navigate({ url: "https://app.example.com/login" })
// 2. The response includes an accessibility snapshot:
// - textbox "Email address" [ref=e3]
// - textbox "Password" [ref=e4]
// - button "Sign in" [ref=e5]
// 3. Act on refs from the snapshot
browser_fill_form({ fields: [
  { name: "Email address", type: "textbox", ref: "e3", value: "user@example.com" },
  { name: "Password", type: "textbox", ref: "e4", value: "hunter2" }
]})
browser_click({ element: "Sign in button", ref: "e5" })

Opt-in Tools: the --caps Flag
Six capability packs stay out of context unless you enable them, which keeps the default tool-schema overhead small. Enable with --caps=vision,pdf style flags.
| Cap | Tools added | Use when |
|---|---|---|
| vision | browser_mouse_click_xy, browser_mouse_down, browser_mouse_drag_xy, browser_mouse_move_xy, browser_mouse_up, browser_mouse_wheel | Canvas/WebGL pages where the accessibility tree is incomplete |
| pdf | browser_pdf_save | Exporting pages as PDF |
| devtools | browser_start_tracing, browser_start_video, browser_highlight, browser_annotate, and more | Performance traces, recordings, visual debugging |
| storage | browser_cookie_get/set/list/clear/delete, browser_localstorage_*, browser_sessionstorage_*, browser_storage_state | Auth state, cookie manipulation, session seeding |
| testing | browser_generate_locator, browser_verify_element_visible, browser_verify_list_visible, browser_verify_text_visible, browser_verify_value | E2E test generation with assertions |
| network | browser_route, browser_unroute, browser_route_list, browser_network_state_set | Mocking requests, offline simulation |
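Because every enabled pack adds tool schemas to context, it helps to enable only what a task needs. The sketch below derives a --caps value from a list of tool names using the table above. `CAP_TOOLS` and `caps_flag` are hypothetical names, the prefixes are an assumption read off the listed tools, and devtools covers only the tools named in the table (its "and more" is not enumerated here).

```python
# Tool names or name prefixes per capability pack, transcribed from the table above.
CAP_TOOLS = {
    "vision": ["browser_mouse_"],
    "pdf": ["browser_pdf_save"],
    "devtools": ["browser_start_tracing", "browser_start_video",
                 "browser_highlight", "browser_annotate"],
    "storage": ["browser_cookie_", "browser_localstorage_",
                "browser_sessionstorage_", "browser_storage_state"],
    "testing": ["browser_generate_locator", "browser_verify_"],
    "network": ["browser_route", "browser_unroute", "browser_network_state_set"],
}


def caps_flag(tools: list) -> str:
    """Return the --caps flag covering the given tools, or "" if none are opt-in."""
    needed = [
        cap
        for cap, prefixes in CAP_TOOLS.items()
        if any(tool.startswith(prefix) for tool in tools for prefix in prefixes)
    ]
    return "--caps=" + ",".join(needed) if needed else ""


print(caps_flag(["browser_click", "browser_pdf_save", "browser_cookie_set"]))
# --caps=pdf,storage
```

Core tools such as browser_click match no pack, so a task that uses only the default set yields an empty string and no flag.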
browser_navigate as an OpenAI Tool
MCP is client-agnostic, so the same 23 tools (including browser_navigate) surface as OpenAI function-calling tools when you attach the server to the OpenAI Agents SDK. The SDK spawns the server over stdio, lists its tools, and exposes each one to the model with its JSON schema:
OpenAI Agents SDK (Python) using Playwright MCP tools
import asyncio

from agents import Agent, Runner
from agents.mcp import MCPServerStdio

async def main():
    # Spawn the Playwright MCP server over stdio for the lifetime of the block.
    async with MCPServerStdio(
        params={"command": "npx", "args": ["@playwright/mcp@latest", "--headless"]}
    ) as playwright:
        agent = Agent(
            name="browser-agent",
            instructions="Use the browser tools. Read the snapshot after each action.",
            mcp_servers=[playwright],
        )
        result = await Runner.run(
            agent, "Open https://example.com and report the page title."
        )
        print(result.final_output)

asyncio.run(main())

The model then emits ordinary tool calls like browser_navigate({"url": "https://example.com"}) and receives the accessibility snapshot as the tool result. No OpenAI-specific wrapper exists or is needed; the tool names and schemas are identical across Claude, GPT, and any other function-calling model.
Snapshots vs Screenshots: the Token Mechanism
Browsers maintain an accessibility tree for screen readers: every button, link, and input with its role, label, and state. browser_snapshot serializes that tree to text. This is why Playwright MCP is cheap relative to vision-based automation:
| Factor | browser_snapshot (default) | browser_take_screenshot |
|---|---|---|
| Format | Structured text (a11y tree) | Image (PNG/JPEG) |
| Model requirement | Any text LLM | Vision-capable model |
| Element targeting | Deterministic refs (e3, e5...) | Pixel coordinates |
| Canvas / WebGL / custom rendering | May miss elements | Captures everything visible |
| Token growth | One snapshot appended per action/navigation | One image per capture |
The snapshot model has one failure mode worth planning for: context growth. Each navigation returns a fresh snapshot, and a content-heavy page produces a large one. By the late steps of a long session, the conversation is carrying snapshots of pages the agent already left. That accumulation accounts for most of the 114K-token figure above, and it is also the argument for the Playwright CLI on long workflows (below) and for --caps=vision only when the tree genuinely cannot describe the page.
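A back-of-the-envelope model makes the growth concrete. This is a sketch under stated assumptions: one snapshot is appended per step, nothing is pruned, and the 5,000-token snapshot size is a made-up illustrative number, not a measurement.

```python
def context_after(steps: int, snapshot_tokens: int, base_tokens: int = 0) -> int:
    """Context size if every step appends one snapshot and nothing is pruned."""
    return base_tokens + steps * snapshot_tokens


def stale_fraction(steps: int) -> float:
    """Share of carried snapshots that describe pages the agent has already left."""
    return (steps - 1) / steps if steps > 0 else 0.0


SNAPSHOT_TOKENS = 5_000  # hypothetical average; real pages vary widely

for steps in (3, 10, 50):
    total = context_after(steps, SNAPSHOT_TOKENS)
    print(f"{steps} steps: {total:,} snapshot tokens, {stale_fraction(steps):.0%} stale")
```

Under these assumptions the context grows linearly with step count, and by step 50 nearly all of it describes pages that are no longer on screen.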
CLI Flags, Profiles, and CI
The server launches a headed Chromium by default. The flags you will actually use:
| Flag | Effect |
|---|---|
| --headless | No browser window; required for CI |
| --browser firefox (or webkit, msedge) | Switch engine from the Chromium default |
| --device "iPhone 15" | Emulate a mobile viewport |
| --caps=vision,pdf,... | Enable opt-in tool packs |
| --isolated | In-memory profile per session; nothing persists |
| --storage-state state.json | Seed cookies/auth into an isolated session |
| --user-data-dir | Pin the persistent profile location |
| --port / --host | Serve over HTTP instead of stdio |
Profiles: by default the server runs persistent, storing logged-in state in a profile keyed to your workspace and reusing it across sessions, so you log in once. --isolated flips to a throwaway in-memory profile, seeded via --storage-state if you need auth.
CI: --headless plus the default Chromium is the stable combination. The repo README also ships a Docker config using the mcr.microsoft.com/playwright/mcp image for containerized headless runs. There are no rate limits to manage since everything is local; budget tokens, not requests.
Playwright MCP vs Browser Use
These two get compared constantly and they are not substitutes. Playwright MCP gives your existing agent browser tools. browser-use is the agent.
| Factor | Playwright MCP | browser-use |
|---|---|---|
| What it is | MCP tool server for an external agent | Autonomous browser agent library with its own LLM loop |
| License | Apache-2.0 (Microsoft) | MIT (~98k GitHub stars) |
| Runtime | Node.js 18+, npx @playwright/mcp@latest | Python 3.11+, pip install "browser-use[core]" |
| Latest version | 0.0.76 | 0.13.1 |
| Who reasons | Your agent (Claude Code, Cursor, Codex, custom) | Its own loop: OpenAI, Anthropic, Google, Ollama, or its ChatBrowserUse / bu-30b-a3b models |
| Page representation | Accessibility-tree snapshots (text), vision opt-in | LLM-driven with vision support |
| Hosted offering | None; fully local | Browser Use Cloud: stealth browsers, proxy rotation, CAPTCHA solving (paid) |
| Cost | $0 + your LLM tokens | $0 OSS + LLM tokens, or paid cloud |
Decision rule: if you already run a coding agent and want it to verify UIs, generate tests, or scrape behind JavaScript, attach Playwright MCP; it is one config block and the agent you already pay for does the thinking. If you are shipping a standalone automation product (form-filling at scale, scraping with anti-bot pressure), browser-use plus its cloud handles stealth and CAPTCHAs, which Playwright MCP deliberately does not touch.
Playwright MCP vs Playwright CLI
The Playwright team ships a CLI alternative aimed at filesystem-capable agents, and their benchmark is the number everyone quotes: a typical task costs ~114,000 tokens via MCP and ~27,000 via CLI. The difference is where snapshots go. MCP streams each snapshot into the model's context as a tool result. The CLI writes snapshots to disk as YAML and returns a file path; the agent reads a file only when it needs it, so context stays flat at step 50.
| Scenario | Use |
|---|---|
| Claude Code / Cursor / Copilot (have shell + filesystem) | CLI for long sessions; MCP works but costs ~4x the tokens |
| Claude Desktop, chat UIs, sandboxed agents (no shell) | MCP; it is the only option |
| Custom agent over the MCP protocol (any model vendor) | MCP; native protocol, no shell wrapping |
| Short tasks (under ~10 steps) | Either; the token gap is small in absolute dollars |
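The difference between the two transports comes down to what a tool call returns. The sketch below contrasts the two patterns in miniature; it is not the Playwright CLI's actual implementation. The function names, the snapshot-N.yaml file naming, and the repeated-line stand-in snapshot are all hypothetical.

```python
import tempfile
from pathlib import Path


def inline_result(snapshot: str) -> str:
    """MCP-style: the whole snapshot becomes the tool result."""
    return snapshot


def on_disk_result(snapshot: str, directory: Path, step: int) -> str:
    """CLI-style: write the snapshot to a file and return only its path."""
    path = directory / f"snapshot-{step}.yaml"  # hypothetical naming scheme
    path.write_text(snapshot, encoding="utf-8")
    return str(path)


# Stand-in for a large page: the same element line repeated many times.
snapshot = '- button "Sign in" [ref=e5]\n' * 200

with tempfile.TemporaryDirectory() as tmp:
    inline = inline_result(snapshot)
    pointer = on_disk_result(snapshot, Path(tmp), step=1)
    print(len(inline), "characters enter context inline")
    print(len(pointer), "characters enter context as a path")
```

The path costs roughly the same few tokens however large the page is, which is why the on-disk approach keeps context flat as the step count grows.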
The same context-budget logic applies beyond browsers: agents that stream every intermediate result into context pay for it on every subsequent step. It is why code-search tools like WarpGrep return ranked file spans instead of dumping whole files into the conversation.
Frequently Asked Questions
Is Playwright MCP free to use?
Yes. Apache-2.0 license, local execution, no paid tier, no API key, no rate limits. Total software cost is $0; you pay only your LLM provider for tokens, roughly $0.03 to $0.34 per typical task depending on model and transport (math above).
What is browser_navigate?
The Playwright MCP tool that loads a URL: browser_navigate({"url": "https://..."}). The tool result includes an accessibility snapshot of the loaded page, which is how the model learns what elements exist before its next action. It is exposed to OpenAI models as a standard function-calling tool via the Agents SDK (code above) and to Claude, Cursor, and Codex through their MCP integrations.
How do I add Playwright MCP to Cursor's mcp.json?
Put {"mcpServers": {"playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]}}} in ~/.cursor/mcp.json (global) or .cursor/mcp.json (per project).
How many tools does Playwright MCP have?
23 in the core set, from browser_navigate to browser_close (full table above). Six opt-in packs behind --caps add coordinate mouse control, PDF export, devtools tracing, storage management, test assertions, and network routing.
Playwright MCP vs browser-use: which should I pick?
Playwright MCP if an agent you already use (Claude Code, Cursor, Codex) needs browser access: it is a $0 tool server, your agent keeps doing the reasoning. browser-use if you are building a standalone autonomous automation in Python, especially if you need its paid cloud's stealth browsers, proxies, and CAPTCHA solving. Full comparison table above.
Does Playwright MCP need a vision model?
No. The default browser_snapshot returns the accessibility tree as text, so text-only models work. Enable --caps=vision for coordinate-based mouse tools on canvas-heavy pages.
What browsers does it support, and can it run headless?
Chromium (default), Firefox, WebKit, and Edge via --browser. It launches headed by default; add --headless for CI, optionally with the official Docker image for containerized runs.
How do I keep token costs down?
Three levers: use the Playwright CLI instead of MCP when your agent has filesystem access (~4x fewer tokens per the Playwright team's benchmark), minimize navigations since each one appends a snapshot, and prefer browser_wait_for over re-navigating to poll for changes. On the model side, a cheaper model like Claude Haiku 4.5 cuts the same 114K-token task from ~$0.34 to ~$0.11.
