Quick Verdict
Decision Matrix (March 2026)
- Choose Qwen Code if: You need multi-provider flexibility, JetBrains IDE support, or self-hosted models with full data privacy. 1,000 free requests/day via OAuth.
- Choose Claude Code if: You need the highest SWE-bench accuracy (80.8% vs 69.6%), parallel sub-agent teams, or deterministic multi-file refactoring.
- Choose Gemini CLI if: You need native Google Search grounding. Same free tier (1,000 req/day), same Apache 2.0 license, but locked to Gemini models.
Qwen Code is a production-grade terminal agent that exposes a defined set of tools to the LLM through OpenAI-compatible function calling. The architecture is a ReAct loop: the model receives tool schemas on every turn, requests tool executions in its response, and receives structured results back. This is the same pattern as Claude Code and Gemini CLI, but Qwen Code is the only one of the three that lets you swap the backing model at runtime to any OpenAI-compatible endpoint.
The underlying Qwen3-Coder-480B model was trained with long-horizon reinforcement learning on 20,000 parallel environments, each running full GitHub issue resolution tasks. The RL training specifically targets multi-turn tool use, which is why the SWE-bench score (69.6%) is meaningfully higher than models trained only on code generation.
Agent Loop Architecture
Qwen Code is a TypeScript/Node.js application structured around a layered ReAct loop. Three core classes handle the lifecycle:
- GeminiClient — manages session lifecycle and API connections
- GeminiChat — owns conversation history and compression logic
- Turn — processes individual LLM response streams, extracts function_calls
The execution path per turn:
- User input enters through InputPrompt with tab completion
- AppContainer prepares React context and UI state
- GeminiClient sends the conversation history plus all registered tool schemas (getFunctionDeclarations()) to the model API
- The model responds with text, one or more function_calls, or both
- Turn extracts function calls from the streaming response
- ToolScheduler validates parameters with Zod schemas, checks approval mode, runs tools (potentially in parallel), and returns ToolResult objects
- Results feed back into GeminiChat as tool response messages for the next turn
Tool results have two fields: llmContent (detailed output for the model, including exit codes and PIDs) and returnDisplay (formatted output for the terminal UI). This separation prevents the model's context from being polluted by display formatting.
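The two-channel result described above can be sketched in TypeScript. Only the field names llmContent and returnDisplay come from Qwen Code; the interface and helper below are an illustrative reconstruction, not the project's actual types:

```typescript
// Illustrative sketch of the two-channel tool result described above.
// Field names match the article; everything else is an assumption.
interface ToolResult {
  llmContent: string;     // detailed output fed back to the model
  returnDisplay: string;  // compact output rendered in the terminal UI
}

function shellResult(exitCode: number, pid: number, stdout: string): ToolResult {
  return {
    // The model gets machine-relevant detail: exit code, PID, full stdout.
    llmContent: `exit_code=${exitCode} pid=${pid}\n${stdout}`,
    // The UI gets a short human-readable line, free of that detail.
    returnDisplay: exitCode === 0 ? "✓ command succeeded" : `✗ exited ${exitCode}`,
  };
}

const r = shellResult(0, 4242, "12 tests passed");
console.log(r.llmContent);    // model-facing: exit code, PID, stdout
console.log(r.returnDisplay); // UI-facing: "✓ command succeeded"
```

Keeping the display string out of llmContent is what prevents UI formatting from leaking into the model's context.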
Parallel Tool Execution
ToolScheduler supports running multiple tool calls in parallel when the model requests them simultaneously. Read-only tools (read_file, grep_search, glob) can run concurrently. Write and execute tools are serialized by default unless running in YOLO mode. This matters for performance on tasks like "read these 10 files and summarize each" — all 10 read_file calls execute in one batch.
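The scheduling policy can be sketched as follows. The type and function names here are illustrative, not Qwen Code's actual ToolScheduler API; the sketch only captures the rule stated above (reads batched concurrently, mutating calls serialized):

```typescript
// Minimal sketch of the scheduling policy described above: read-only tool
// calls run in one concurrent batch, write/execute calls run one at a time.
type Kind = "read" | "write" | "execute";
interface ToolCall { name: string; kind: Kind; run: () => Promise<string>; }

async function schedule(calls: ToolCall[]): Promise<string[]> {
  const reads = calls.filter(c => c.kind === "read");
  const rest = calls.filter(c => c.kind !== "read");
  // All read-only calls execute in parallel...
  const readResults = await Promise.all(reads.map(c => c.run()));
  // ...while mutating calls are serialized to keep side effects deterministic.
  const restResults: string[] = [];
  for (const c of rest) restResults.push(await c.run());
  return [...readResults, ...restResults];
}
```

Under this policy, the "read these 10 files" case becomes a single Promise.all batch rather than ten sequential round-trips.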
Built-in Tool Registry
Every tool is a subclass of BaseDeclarativeTool, registered in ToolRegistry, and exported to the LLM via getFunctionDeclarations(). The LLM sees the same JSON Schema definitions you would see if you called /tools --verbose in a session.
| Tool Name | Kind | Parameters | Confirmation Required |
|---|---|---|---|
| read_file | Read | path (string, abs), offset (int?), limit (int?) | No |
| write_file | Write | file_path (string, abs), content (string) | Yes (diff shown) |
| edit | Edit | file_path, old_string, new_string, replace_all (bool?) | Yes (diff shown) |
| list_directory | Read | path (string), ignore (string[]?), respect_git_ignore (bool?) | No |
| glob | Read | pattern (string), path (string?) | No |
| grep_search | Read | pattern (string), path (string?), glob (string?), limit (int?) | No |
| run_shell_command | Execute | command (string), is_background (bool?), timeout_ms (int?) | Yes (DEFAULT mode) |
| web_fetch | Read | url (string), method (string?), headers (object?), body (string?) | No |
| web_search | Read | query (string) | No |
| save_memory | Write | operation (add|clear), content (string?) | No |
| todo_write | Other | operation (add|update|complete|delete|clear), task details | No |
| task | Other | description (string), agent_type (file-search|...) | No |
| skill | Other | operation (execute|list|describe), skill_name (string?) | Depends |
Tool Schema Format
Tools are exposed to the model as standard OpenAI function definitions. The getFunctionDeclarations() method on ToolRegistry returns an array sorted alphabetically. Here is the schema for edit:
edit tool — function declaration sent to LLM
{
"type": "function",
"function": {
"name": "edit",
"description": "Replaces text in a file. Requires old_string to match exactly once in the file unless replace_all is set to true.",
"parameters": {
"type": "object",
"properties": {
"file_path": {
"type": "string",
"description": "Absolute path to the file to modify."
},
"old_string": {
"type": "string",
"description": "Exact text to find. Must be unique in the file."
},
"new_string": {
"type": "string",
"description": "Text to replace old_string with."
},
"replace_all": {
"type": "boolean",
"description": "If true, replace all occurrences. Default: false."
}
},
"required": ["file_path", "old_string", "new_string"]
}
}
}

run_shell_command tool — function declaration sent to LLM
{
"type": "function",
"function": {
"name": "run_shell_command",
"description": "Execute a shell command in a persistent session. Commands that modify the filesystem require user approval in DEFAULT mode.",
"parameters": {
"type": "object",
"properties": {
"command": {
"type": "string",
"description": "Shell command to execute."
},
"is_background": {
"type": "boolean",
"description": "If true, run asynchronously. Default: false."
},
"timeout_ms": {
"type": "number",
"description": "Maximum execution time in milliseconds."
}
},
"required": ["command"]
}
}
}

read_many_files (@ command)
The @path syntax is a shortcut that triggers read_many_files, a specialized batch read tool. It injects directory contents into the prompt with git-aware filtering: files matching .gitignore and .qwenignore patterns are excluded. This is how you efficiently pull an entire module into context without manually listing files.
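An illustrative session snippet (the module path here is hypothetical):

```
# Inside an interactive qwen session
> @src/auth/ Explain how token refresh is handled in this module
# read_many_files injects every file under src/auth/ that is not
# excluded by .gitignore or .qwenignore into the prompt
```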
File Edit Format
Qwen Code uses a search-and-replace format for file edits, not unified diffs. The model calls the edit tool with old_string and new_string. The tool validates that old_string appears exactly once in the target file, then replaces it.
| Tool | Format | Use Case |
|---|---|---|
| edit | Search-and-replace (old_string → new_string) | Targeted changes to existing files |
| write_file | Full file content overwrite | Creating new files or complete rewrites |
The search-and-replace approach has a known failure mode: if the model specifies an old_string that matches zero or multiple locations, the edit fails. Qwen Code includes a multi-stage correction mechanism to handle this: if the initial match fails, the tool calls the model again with the file content and asks it to refine the old_string to be more specific. This loop runs up to a configured number of retries before giving up.
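The retry idea can be sketched as exact-then-fuzzy matching. The function below is an illustration only: Qwen Code's actual correction mechanism re-queries the model to refine old_string, whereas this sketch falls back to a local whitespace-normalized comparison:

```typescript
// Illustration of the failure mode and a first-stage correction:
// try an exact, unique match, then fall back to a whitespace-tolerant
// per-line comparison. Returns the edited content, or null on failure.
function applyEdit(content: string, oldStr: string, newStr: string): string | null {
  const count = content.split(oldStr).length - 1;
  if (count === 1) return content.replace(oldStr, newStr); // happy path
  if (count > 1) return null;                              // ambiguous: hard fail
  // Zero matches: retry with collapsed whitespace (single-line oldStr only).
  const norm = (s: string) => s.replace(/\s+/g, " ").trim();
  const lines = content.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (norm(lines[i]) === norm(oldStr)) {
      lines[i] = newStr;
      return lines.join("\n");
    }
  }
  return null; // in the real tool, further model-assisted retries would run here
}
```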
Example: model requests an edit
// Model output (function call in response stream)
{
"name": "edit",
"arguments": {
"file_path": "/home/user/project/src/api.ts",
"old_string": " const timeout = 5000;",
"new_string": " const timeout = 30000;"
}
}
// Tool result returned to model
{
"llmContent": "Successfully replaced 1 occurrence in /home/user/project/src/api.ts",
"returnDisplay": "✓ Edited src/api.ts (1 change)"
}

Comparing Edit Formats Across Terminal Agents
All three major terminal agents use search-and-replace as their primary edit format:
- Qwen Code: edit tool with old_string/new_string + write_file for full overwrites
- Claude Code: str_replace_based_edit_tool with old_str/new_str (same pattern)
- Gemini CLI: replace_file_content with diff patches via the diff package
The practical difference is in the correction mechanism. Qwen Code's multi-stage fuzzy correction handles model hallucination better than a hard fail. Claude Code relies on the model generating precise strings from the start. For fast-apply use cases, Morph's API handles the edit translation layer regardless of which agent generates the instruction.
Context Management
QWEN.md Project Files
Project context loads from QWEN.md files in a hierarchy: global user (~/.qwen/QWEN.md), then project root, then subdirectories. Run /init inside a session to generate a starter QWEN.md from your current project structure. The context.fileName setting lets you override the filename.
Context Compression
When conversation history grows large, two mechanisms kick in:
- Manual: /compress summarizes the current chat history into a condensed form. Useful before starting a new sub-task that does not need full prior context.
- Automatic: model.chatCompression.contextPercentageThreshold triggers compression when history exceeds a set percentage of the model's context window. Default is typically 70-80%.
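A settings.json fragment for the automatic trigger might look like this. The 0.7 value is an example within the 70-80% default range mentioned above, and the exact value format (fraction vs. percentage) may differ by version — check your release's configuration reference:

```json
{
  "model": {
    "chatCompression": {
      "contextPercentageThreshold": 0.7
    }
  }
}
```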
Context Window Limits
Qwen Code determines the effective context window from built-in defaults per model name. If your provider's actual limit differs (for example, you're using an older API version or a quantized local model), override it with contextWindowSize in generationConfig:
Override context window in settings.json
{
"model": {
"name": "qwen3-coder-plus",
"generationConfig": {
"contextWindowSize": 131072,
"enableCacheControl": true,
"samplingParams": {
"temperature": 0.1,
"top_p": 0.95,
"max_tokens": 8192
}
}
}
}

Token Caching
For API key authentication (not OAuth), Qwen Code activates prompt caching when the provider supports it. System instructions and stable QWEN.md context are cached, so repeated turns in the same session do not re-process the full system prompt. Set enableCacheControl: true in generationConfig to enable. DashScope and OpenRouter support cache pricing for Qwen models.
| Context Feature | Qwen Code | Claude Code | Gemini CLI |
|---|---|---|---|
| Project memory file | QWEN.md (configurable) | CLAUDE.md | GEMINI.md |
| Manual compression | /compress command | Automatic compaction | /compress command |
| Context window override | contextWindowSize in generationConfig | Not exposed | Not exposed |
| Native context (Qwen3-Coder-480B) | 256K tokens | 200K (Opus 4.6) | 1M (Gemini 2.5 Pro) |
| Extended context | 1M via YaRN extrapolation | N/A | 1M (native) |
| Token caching | Yes (API key auth) | Yes (included) | Yes (Gemini API) |
| Session persistence | /chat save/resume | Cross-session memory | Limited |
MCP Integration
MCP (Model Context Protocol) servers register additional tools into the ToolRegistry at startup. From the model's perspective, MCP tools are indistinguishable from built-in tools — they appear in the same getFunctionDeclarations() array.
Discovery Process
On startup, discoverMcpTools() iterates the mcpServers config, establishes transport connections, calls each server's tool listing endpoint, sanitizes schemas (removes $schema and additionalProperties for compatibility), and registers tools. Name conflicts use first-registration-wins with prefixing: if two servers both expose a tool named search, the second becomes serverName__search.
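The conflict rule can be sketched as follows; the registry shape is illustrative, not Qwen Code's actual ToolRegistry:

```typescript
// Sketch of first-registration-wins with prefixing, as described above.
// Maps registered tool name -> owning server name.
function registerMcpTools(
  registry: Map<string, string>,
  serverName: string,
  toolNames: string[],
): void {
  for (const name of toolNames) {
    if (!registry.has(name)) {
      registry.set(name, serverName);                      // first server keeps the bare name
    } else {
      registry.set(`${serverName}__${name}`, serverName);  // later servers get prefixed
    }
  }
}
```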
Transport Types
- Stdio: Spawns a subprocess, communicates via stdin/stdout. Best for local Python or Node.js servers.
- SSE: Connects to a Server-Sent Events endpoint. For persistent remote servers.
- Streamable HTTP: HTTP streaming. For cloud-hosted MCP services.
MCP server configuration examples (settings.json)
{
"mcpServers": {
"warpgrep": {
"command": "npx",
"args": ["-y", "@morphllm/warpgrep-mcp"],
"env": {
"WARPGREP_API_KEY": "$WARPGREP_API_KEY"
},
"timeout": 10000
},
"filesystem": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/workspace"],
"trust": false
},
"remote-tools": {
"httpUrl": "https://my-mcp-server.example.com/mcp",
"headers": {
"Authorization": "Bearer $REMOTE_MCP_TOKEN"
},
"timeout": 5000,
"includeTools": ["search", "fetch"],
"excludeTools": ["dangerous_tool"]
},
"google-drive": {
"url": "https://mcp.googleapis.com/sse",
"authProviderType": "google_credentials"
}
}
}

Tool Wrapping
Each discovered MCP tool is wrapped in a DiscoveredMCPTool instance. This wrapper handles:
- Confirmation dialogs (bypassed when trust: true on the server config)
- Per-tool include/exclude filtering via includeTools and excludeTools
- Execution routing: the tool JSON parameters are passed to the MCP server via stdin (Stdio) or POST (HTTP)
- Response normalization: MCP responses are mapped to the standard ToolResult format
OAuth for Remote MCP Servers
Remote SSE/HTTP servers that require authentication trigger a browser-based OAuth flow. The token is stored in ~/.qwen/mcp-oauth-tokens.json. Three provider types are supported: dynamic_discovery (server advertises its OAuth config), google_credentials (Application Default Credentials), and service_account_impersonation (Google Cloud service accounts).
Multi-Provider Configuration
Qwen Code is provider-agnostic. The modelProviders array in ~/.qwen/settings.json defines available models. Switch at runtime with /model. Each provider entry specifies a protocol, base URL, model ID, and environment variable for the API key.
Multi-provider settings.json — full example
{
"modelProviders": [
{
"protocol": "openai",
"id": "qwen3-coder-plus",
"displayName": "Qwen3-Coder (DashScope)",
"baseURL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"envKey": "QWEN_API_KEY"
},
{
"protocol": "openai",
"id": "qwen3-coder-480b-a35b-instruct",
"displayName": "Qwen3-Coder-480B (OpenRouter)",
"baseURL": "https://openrouter.ai/api/v1",
"envKey": "OPENROUTER_API_KEY",
"generationConfig": {
"contextWindowSize": 262144
}
},
{
"protocol": "anthropic",
"id": "claude-sonnet-4-5-20251022",
"displayName": "Claude Sonnet 4.5",
"envKey": "ANTHROPIC_API_KEY"
},
{
"protocol": "openai",
"id": "gpt-5.4",
"displayName": "GPT-5.4 (OpenAI)",
"envKey": "OPENAI_API_KEY"
},
{
"protocol": "openai",
"id": "qwen3-coder",
"displayName": "Qwen3-Coder (Ollama local)",
"baseURL": "http://localhost:11434/v1",
"envKey": ""
}
],
"model": {
"name": "qwen3-coder-plus"
}
}

Environment setup
# DashScope (Alibaba Cloud — direct Qwen API)
export QWEN_API_KEY=sk-...
# OpenRouter (multi-model routing, same endpoint)
export OPENROUTER_API_KEY=sk-or-...
# Anthropic (for Claude models via Qwen Code)
export ANTHROPIC_API_KEY=sk-ant-...
# OpenAI (for GPT-5.4)
export OPENAI_API_KEY=sk-...
# Tavily (web_search tool — required for web search)
export TAVILY_API_KEY=tvly-...
# Launch with a specific model
qwen --model qwen3-coder-480b-a35b-instruct

Switching providers mid-session with /model creates a new ContentGenerator instance for the selected provider but preserves conversation history. The model receives the full prior context on the first turn after switching.
Permission and Sandbox Model
Four Approval Modes
Every tool call routes through the approval system before execution. Four modes control the behavior:
| Mode | Flag | Read Tools | Edit/Write Tools | Shell Commands |
|---|---|---|---|---|
| PLAN | --approval-mode plan | Auto | Blocked | Blocked |
| DEFAULT | --approval-mode default | Auto | Requires approval | Requires approval |
| AUTO-EDIT | --approval-mode auto-edit | Auto | Auto | Requires approval |
| YOLO | --approval-mode yolo | Auto | Auto | Auto |
Per-tool overrides are available through tools.allowed (whitelist specific tools that bypass confirmation regardless of mode) and tools.exclude (remove tools from the registry entirely).
Sandboxing
The --sandbox flag enables container isolation for run_shell_command. Three backends:
- Docker/Podman (Linux): Uses the qwen-code-sandbox image. Drop in a custom .qwen/sandbox.Dockerfile for project-specific toolchains.
- Seatbelt (macOS): sandbox-exec profiles restrict filesystem and network access. Configured via SEATBELT_PROFILE.
- No sandbox: Default. All shell commands run in the host environment with workspace boundary enforcement only.
Workspace Boundary Enforcement
Independent of approval mode, all file tools (read_file, write_file, edit) validate that paths are absolute and within the registered workspace directories. Commands cannot access files outside the project root. The /directory add slash command extends the workspace to additional directories.
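The boundary check amounts to "absolute path, resolved inside a registered root." A sketch of that rule (function name and shape are illustrative, not Qwen Code's implementation):

```typescript
import * as path from "node:path";

// Sketch of workspace boundary enforcement as described above: a file path
// is accepted only if it is absolute and resolves inside one of the
// registered workspace roots (so ../ traversal cannot escape).
function isInsideWorkspace(filePath: string, roots: string[]): boolean {
  if (!path.isAbsolute(filePath)) return false;
  const resolved = path.resolve(filePath); // collapse ../ segments
  return roots.some(root => {
    const rel = path.relative(path.resolve(root), resolved);
    // Inside the root iff the relative path does not climb upward out of it.
    return rel === "" || (!rel.startsWith("..") && !path.isAbsolute(rel));
  });
}
```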
settings.json — permission configuration
{
"tools": {
"approvalMode": "auto-edit",
"allowed": ["read_file", "grep_search", "glob", "list_directory"],
"exclude": ["web_search"],
"sandbox": false,
"shell": {
"enableInteractiveShell": true
},
"enableToolOutputTruncation": true,
"truncateToolOutputThreshold": 50000,
"truncateToolOutputLines": 500
}
}

Setup and Installation
Requires Node.js 20+. Install takes under a minute.
Install
# npm (recommended)
npm install -g @qwen-code/qwen-code@latest
# Homebrew (macOS/Linux)
brew install qwen-code
# Official install script
curl -fsSL https://qwen-code-assets.oss-cn-hangzhou.aliyuncs.com/installation/install-qwen.sh | bash
# Verify
qwen --version

Quick start with Qwen OAuth (free, no API key)
# Launch — select "Qwen OAuth" on first run
qwen
# Headless mode (CI, scripts)
qwen -p "Add error handling to src/api.ts"
# Pipe mode
git diff HEAD~1 | qwen -p "Summarize these changes"
tail -f app.log | qwen -p "Alert me when you see an error"
# Launch in auto-edit mode (no confirmation for file edits)
qwen --approval-mode auto-edit
# Launch with sandboxing enabled
qwen --sandbox

Project initialization
# In your project root, generate QWEN.md
qwen
/init
# QWEN.md structure (customize for your project)
# ---
# Project: my-api
# Stack: TypeScript, Node.js 20, PostgreSQL
# Test command: bun test
# Lint command: bun run lint
# Build command: bun run build
# Key directories: src/api, src/db, tests/
# ---

Qwen Code vs Claude Code vs Gemini CLI
| Capability | Qwen Code | Claude Code | Gemini CLI |
|---|---|---|---|
| License | Apache 2.0 | Proprietary | Apache 2.0 |
| Free tier | 1,000 req/day (OAuth) | None | 1,000 req/day (Google account) |
| SWE-bench Verified | 69.6% (Qwen3-Coder-480B) | 80.8% (Opus 4.6) | 63.8% (Gemini 2.5 Pro) |
| Native context window | 256K tokens | 200K tokens (Opus 4.6) | 1M tokens |
| Extended context | 1M via YaRN | N/A | 1M (native) |
| File edit format | Search-replace (old_string/new_string) | Search-replace (old_str/new_str) | Unified diff patches |
| Multi-provider support | Any OpenAI-compatible endpoint | Anthropic only | Google only |
| Self-hosted models | Yes (Ollama, vLLM) | No | No |
| Parallel tool execution | Yes (ToolScheduler) | Yes | Yes |
| Parallel sub-agents | Single task tool | Yes (agent teams) | No |
| MCP support | Yes (all 3 transports) | Yes | Yes |
| Google Search grounding | No (Tavily API for web_search) | No | Built-in native |
| IDE integration | VS Code (beta), JetBrains | VS Code | VS Code, Zed |
| Sandboxing | Docker/Podman/macOS Seatbelt | macOS Seatbelt | No native sandboxing |
| Context compression | /compress + auto threshold | Automatic compaction | /compress command |
| Token caching | Yes (enableCacheControl) | Yes (included) | Yes (Gemini API) |
Model Support and Token Costs
Qwen Code works with any OpenAI-compatible provider. These are the models and costs most relevant to Qwen Code deployments as of March 2026:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context | Provider |
|---|---|---|---|---|
| Qwen3-Coder-480B (free tier) | $0 (OAuth, 1K req/day) | $0 | 256K | DashScope OAuth |
| Qwen3-Coder-480B (DashScope API) | $0.22 | $1.00 | 256K | DashScope |
| Qwen3-Coder-Plus (DashScope) | $0.30 | $1.50 | 256K | DashScope |
| Qwen3-Coder-480B (OpenRouter) | ~$0.22 | ~$1.00 | 256K | OpenRouter |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200K | Anthropic direct |
| Claude Opus 4.6 | $15.00 | $75.00 | 200K | Anthropic direct |
| GPT-5.4 | $2.50 | $10.00 | 128K | OpenAI direct |
| Qwen3-Coder (Ollama local) | $0 (hardware cost) | $0 | Up to 256K | Self-hosted |
Cost comparison: 100K tokens of context
For a session with 100K input tokens and 5K output tokens:
- Qwen3-Coder-480B (free OAuth): $0 (counts as 1 request)
- Qwen3-Coder-480B (DashScope): $0.022 input + $0.005 output = $0.027
- Claude Sonnet 4.5: $0.30 input + $0.075 output = $0.375
- Claude Opus 4.6: $1.50 input + $0.375 output = $1.875
For heavy API usage, Qwen3-Coder is 10-70x cheaper than Claude models at the same context length. The tradeoff is the 11-point SWE-bench gap.
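The per-session figures above are straight linear pricing; a few lines reproduce them:

```typescript
// Reproduce the per-session cost arithmetic above: price per 1M tokens
// scaled by actual usage (here 100K input tokens, 5K output tokens).
function sessionCost(inPerM: number, outPerM: number, inTok: number, outTok: number): number {
  return (inTok / 1_000_000) * inPerM + (outTok / 1_000_000) * outPerM;
}

console.log(sessionCost(0.22, 1.00, 100_000, 5_000));   // Qwen3-Coder (DashScope) ≈ $0.027
console.log(sessionCost(3.00, 15.00, 100_000, 5_000));  // Claude Sonnet 4.5 ≈ $0.375
console.log(sessionCost(15.00, 75.00, 100_000, 5_000)); // Claude Opus 4.6 ≈ $1.875
```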
Benchmarks
Qwen3-Coder-480B was trained with long-horizon RL on 20,000 parallel environments. The RL loop specifically targets SWE-bench-style tasks: given a GitHub issue and a repository, resolve the issue through multi-turn tool use. This training approach directly optimizes for the benchmark, which is why the score (69.6%) is meaningful rather than an artifact of overfitting.
| Benchmark | Qwen3-Coder-480B | Claude Opus 4.6 | Gemini 2.5 Pro |
|---|---|---|---|
| SWE-bench Verified | 69.6% | 80.8% | 63.8% |
| LiveCodeBench v5 | Competitive (exact score TBD) | N/A | 70.4% |
| BFCL (function calling) | Strong (RL-trained) | N/A | N/A |
| CodeForces ELO | Reported, specific score varies by run | N/A | N/A |
The 11-point gap between Qwen3-Coder-480B (69.6%) and Claude Opus 4.6 (80.8%) on SWE-bench is not uniform across task types: it is small on simple single-file bugs and large on complex multi-file refactors that require deep dependency understanding. For a free model, 69.6% is a high floor. For production critical paths where failing 1 in 10 tasks has real cost, Claude Code's accuracy advantage justifies the subscription.
What Developers Are Saying
Hacker News — Free Tier Discussion
"The qwen coder CLI gives you 1,000 free requests per day to the qwen coder model... Roo Code in VS Code and Qwen Coder in LM Studio is a decent local-only combo."
It's FOSS — Switching from Claude Code
An experienced sysadmin who switched from Claude Code to Qwen Code reported it "turns intent into correct, reviewable shell commands without any fees or demanding blind trust." The switch was driven by cost, data privacy from local deployment, and sufficient capability for Linux/sysadmin workflows.
InfoWorld — Honest Assessment
"Qwen Code is good but not great." Testing found the tool excels at new code generation but struggles with complex debugging: it "switched to a much simpler (and wrong) algorithm and couldn't figure out how to fix its own mistake."
GitHub Issues — Known Issues
Issue #1924: "Useless compression and buggy contextWindowSize" — users report that /compress sometimes fails to meaningfully reduce context on large sessions, and the contextWindowSize override is not always respected by all providers. Issue #882: quota exhaustion shows misleading error messages in the free tier.
Limitations
Technical Limitations (March 2026)
- Complex multi-file debugging: Real-world performance diverges from benchmark scores on tasks requiring tight dependency tracking across 10+ files. The multi-stage edit correction helps but does not eliminate failures.
- OAuth unavailable in non-interactive environments: Free tier OAuth requires a browser. CI, SSH sessions, and containers must use API key authentication. Free tier does not extend to headless use.
- contextWindowSize bugs: Some providers do not respect the override, causing Qwen Code to miscalculate compression triggers. Workaround: set it explicitly and verify with /stats.
- No parallel sub-agents: The task tool delegates to a single subagent at a time. Claude Code's agent teams (multiple sub-agents with shared task lists running in parallel) have no equivalent. For large parallel refactoring jobs, this is a real gap.
- VS Code extension is beta: The Qwen Code Companion extension is newer than the Gemini CLI or Claude Code IDE integrations. Diff preview and real-time change display have reported rendering issues.
- Model content restrictions: Qwen models include content filtering that affects Chinese political topics. For most development tasks this is irrelevant, but it can surface unexpectedly in compliance or legal tech contexts.
- Web search requires Tavily API key: Unlike Gemini CLI's native Google Search grounding, web_search in Qwen Code requires a separate Tavily API key. The Tavily free tier covers 1,000 searches/month.
When to Use Qwen Code
JetBrains IDE Users
Qwen Code is the only major terminal agent with JetBrains companion support. IntelliJ, PyCharm, WebStorm users get native diff preview and real-time change display without leaving the IDE.
Self-Hosted Model Deployments
Point Qwen Code at a local Ollama or vLLM endpoint running Qwen3-Coder. All inference stays on-premise. Necessary for air-gapped environments or strict data residency requirements.
Multi-Model CI Workflows
Headless mode plus multi-provider config means you can run different models per task type in CI: cheap Qwen for static analysis, Claude for complex refactors, all via the same Qwen Code binary.
Cost-Sensitive Teams
Qwen3-Coder-480B costs $0.22/1M input tokens on DashScope. Claude Sonnet 4.5 costs $3.00/1M. For teams running thousands of agentic sessions per day, that 13x cost difference compounds fast.
Open-Source Auditability
Apache 2.0 license, 19.8K GitHub stars. You can read the ToolRegistry implementation, audit the permission system, fork and modify the tool schemas, or integrate qwen-code-core as a library in your own tooling.
Unix Pipeline Integration
git diff | qwen -p, tail -f app.log | qwen -p, any stdin pipe works. Composable with existing shell workflows. Run in cron jobs, GitHub Actions, or as part of custom Makefiles.
| Priority | Best Choice | Reason |
|---|---|---|
| Highest SWE-bench accuracy | Claude Code | 80.8% vs 69.6% — 11-point gap on hard tasks |
| JetBrains IDE | Qwen Code | Only major agent with JetBrains companion plugin |
| Self-hosted inference | Qwen Code | Ollama/vLLM endpoint support |
| Multi-model flexibility | Qwen Code | Any OpenAI-compatible endpoint at runtime |
| Google Search grounding | Gemini CLI | Native, no separate API key |
| Parallel sub-agents | Claude Code | Agent teams not available in Qwen Code |
| Free daily requests | Qwen Code or Gemini CLI | Both: 1,000 req/day free |
| Cheapest API calls | Qwen Code (DashScope) | $0.22/1M input vs $3+ for Claude |
| Sandboxing options | Qwen Code | Docker, Podman, macOS Seatbelt all supported |
Frequently Asked Questions
What file edit format does Qwen Code use?
Search-and-replace. The edit tool takes old_string (must match exactly once) and new_string. A multi-stage correction mechanism retries with fuzzy matching if the exact string is not found. For full file overwrites, write_file is used instead. Both tools show a diff before writing and require confirmation in DEFAULT mode.
What tools does the LLM actually see?
Thirteen built-in tools as OpenAI function definitions: read_file, write_file, edit, list_directory, glob, grep_search, run_shell_command, web_fetch, web_search, save_memory, todo_write, task, and skill. MCP servers add additional tools to the same registry. Run /tools --verbose to see the full JSON schemas for every registered tool.
How does Qwen Code manage context on large codebases?
Project context loads from hierarchical QWEN.md files. /compress summarizes history manually. chatCompression.contextPercentageThreshold triggers automatic compression. The @path syntax batch-reads files with git-aware filtering. For models with smaller effective context, override with contextWindowSize in generationConfig.
How does MCP integration work technically?
discoverMcpTools() runs at startup, connects to configured servers via Stdio/SSE/HTTP, fetches tool schemas, sanitizes them (removes $schema and additionalProperties), and registers tools in ToolRegistry. MCP tools are wrapped in DiscoveredMCPTool for confirmation logic. Tool name conflicts get server-prefix namespacing.
What approval modes exist and how do they differ?
PLAN: no modifications allowed. DEFAULT: approval required for all edit/write/exec tools. AUTO-EDIT: file edits auto-approved, shell commands need approval. YOLO: everything auto-approved. Set via --approval-mode flag or tools.approvalMode in settings.json. Individual tools can be permanently whitelisted with tools.allowed.
Can I run Qwen Code in CI without interactive auth?
Yes, but not with the free OAuth tier — OAuth requires a browser. Use a QWEN_API_KEY (DashScope) or any other provider API key. Set the key as a CI secret, configure it in settings.json or as an environment variable, and run with qwen -p "prompt" for headless execution.
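As an illustration, a GitHub Actions step might look like the sketch below. The workflow wiring is an assumption; only the QWEN_API_KEY variable, the npm package name, and the qwen -p invocation come from this article:

```yaml
# Hedged sketch — adapt to your pipeline.
# Assumes QWEN_API_KEY is stored as a repository secret.
- name: Review diff with Qwen Code
  env:
    QWEN_API_KEY: ${{ secrets.QWEN_API_KEY }}
  run: |
    npm install -g @qwen-code/qwen-code@latest
    git diff origin/main...HEAD | qwen -p "Summarize these changes and flag risky edits"
```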
How does Qwen3-Coder handle function calling differently from standard models?
Qwen3-Coder was specifically RL-trained on tool use tasks, which means it's more reliable than fine-tuned-only models at knowing when to call tools vs. when to respond directly. The model uses a Hermes-style tool calling format and was deployed with --tool-call-parser qwen3_coder on vLLM. It runs exclusively in non-thinking mode (no extended reasoning blocks), which keeps tool call latency low and outputs immediately deployable.
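For self-hosted vLLM deployments, the parser is enabled at serve time. Only --tool-call-parser qwen3_coder comes from the text above; the model ID and the --enable-auto-tool-choice flag are assumptions to verify against your vLLM version:

```
# Hedged sketch of a self-hosted vLLM launch for Qwen3-Coder tool calling.
vllm serve Qwen/Qwen3-Coder-480B-A35B-Instruct \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder
```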
How does Qwen Code compare to Gemini CLI architecturally?
They share the same base architecture (Qwen Code is a fork of Gemini CLI). Divergences: Qwen Code adds enhanced parser support for Qwen models, broader provider backend support (Anthropic, Google, OpenAI all in one config), JetBrains integration, and Docker/Podman sandboxing. Gemini CLI adds native Google Search grounding, which Qwen Code replaces with the Tavily-powered web_search tool. The agent loop, tool dispatch, and MCP integration patterns are nearly identical.
Related
Add Semantic Search to Qwen Code via MCP
WarpGrep is a semantic codebase search MCP server. Add it to mcpServers in your Qwen Code settings.json and the agent finds the right files on the first tool call instead of glob-and-grep cycles.
Sources
- Qwen Code GitHub Repository
- Qwen Code File System Tools Documentation
- MCP Servers with Qwen Code
- Qwen Code Configuration Reference
- Core Tools Reference (DeepWiki)
- Qwen3-Coder: Agentic Coding in the World (Qwen Blog)
- I Ditched Claude Code and Now Using Open Source Qwen AI (It's FOSS)
- Qwen Code is Good But Not Great (InfoWorld)
- Qwen Code Launch Discussion (Hacker News)
- Qwen3 Coder vs Kimi K2 vs Claude 4 Sonnet (Composio)