Qwen CLI (Qwen Code): Install, Models, and How It Differs from Gemini CLI

Qwen Code (the Qwen CLI) is an open-source agentic coding agent that runs in your terminal. It is published on npm as @qwen-code/qwen-code and installed with npm install -g @qwen-code/qwen-code@latest. It was adapted from Google Gemini CLI v0.8.2 and tuned for Qwen3-Coder models. Qwen OAuth was discontinued on April 15, 2026, so it now needs an API key or an OpenAI-compatible endpoint. Morph serves Qwen models behind that same protocol.

@qwen-code/qwen-code

npm package name

Gemini CLI v0.8.2

Forked from

256K → 1M

Qwen3-Coder context (native to extrapolated)

2026-04-15

Qwen OAuth discontinued

What Qwen Code Is

Qwen Code is an open-source AI coding agent that lives in the terminal. It is published on npm as @qwen-code/qwen-code. After install, you invoke it with the qwen command. It is the same category of tool as Gemini CLI and Claude Code: a terminal agent that reads files, runs commands, and edits code in a loop.

Qwen describes it as a research-purpose CLI tool adapted from Gemini CLI, with an enhanced parser and tool support for Qwen-Coder models. The word "adapted" is precise. The project was originally based on Google Gemini CLI v0.8.2, then diverged.

Starting from Qwen Code v0.1, the team stopped syncing with upstream Gemini CLI and began independent development. So the two share a common ancestor (the Gemini CLI terminal UI and agent loop) but no longer track each other. New Qwen Code features do not come from Gemini CLI, and vice versa.

Built-in agentic features include Auto-Memory, Auto-Skills, SubAgents, Agent Teams, and MCP, plus built-in skills such as /review, /batch, /loop, and /bugfix. The SubAgents and Agent Teams features are the hierarchical-agent pattern: a main agent delegates focused subtasks to specialized subagents that report back.

Naming: Qwen CLI, Qwen Code, qwen-code

People search for this tool as "qwen cli", "qwen code", "qwen-code", and "qwen coder cli". They all refer to the same thing: the @qwen-code/qwen-code npm package, whose binary is qwen. There is no separate official tool called "qwen cli".

Install Qwen Code

Qwen Code installs from npm as a global package. The verified install command is a single line. You can also build it from source by cloning the repository if you want to track main or modify it.

Install Qwen Code from npm

# Install globally from npm
npm install -g @qwen-code/qwen-code@latest

# Verify the install
qwen --version

# Start the agent in your project directory
cd your-project
qwen

# Inside the CLI, configure a model provider
/auth

After install, the first thing to do is configure a provider with /auth. Since the free OAuth flow was discontinued, this is where you supply an API key or point the CLI at an endpoint. The next sections cover which models it targets and how access changed.

Which Models It Targets

Qwen Code is built to work with Qwen3-Coder models, using customized prompts and function-calling protocols to run Qwen3-Coder on agentic coding tasks. The prompts and parsing are tuned for this model family, which is the main reason to use Qwen Code over generic Gemini CLI.

The flagship target model is Qwen3-Coder-480B-A35B-Instruct. It is a 480B-parameter Mixture-of-Experts model with 35B active parameters, so it activates a fraction of its weights per token. It supports 256K tokens of context natively and up to 1M tokens with extrapolation.

480B

Total parameters (Qwen3-Coder flagship)

35B

Active parameters per token

256K

Native context window

Context with extrapolation

Qwen Code is not restricted to Qwen models. Supported model providers include OpenAI-, Anthropic-, Gemini-, and Qwen-compatible APIs, plus any third-party provider or local model via Ollama or vLLM. That makes the same CLI usable as a front end for almost any model you can reach over an OpenAI-compatible or provider-native protocol.

Access After the OAuth Discontinuation

Qwen OAuth, the free-quota login flow, was discontinued on April 15, 2026. Before that date, signing in with a Qwen account granted a free request quota. After it, that path is gone.

Access is now via API keys or coding-plan subscriptions rather than free OAuth. Documented options include the Alibaba Cloud Coding Plan, OpenRouter, and Fireworks AI. There is no documented free quota after the discontinuation date, so plan on a paid key or subscription.

No documented free quota after 2026-04-15

If a tutorial tells you to "just log in with Qwen OAuth for free," it predates April 15, 2026. The current path is to configure an API key or coding-plan subscription through the /auth command, or point the CLI at any OpenAI-compatible endpoint you already pay for.

The CLI itself stays open source and free to install. The cost is the model access, not the tool. This is the same split as Gemini CLI and Claude Code: the terminal agent is free, the tokens are not.

Qwen Code vs Gemini CLI vs Claude Code

All three are terminal coding agents with the same core loop: read context, call tools, edit files, repeat. They differ on what is open source, which model they default to, and the language they are written in. The table below uses only verified facts. Cells that cannot be verified from primary sources are left out rather than guessed.

Attribute	Qwen Code	Gemini CLI	Claude Code
Open source	Yes	Yes	No (proprietary CLI)
npm package	@qwen-code/qwen-code	Google Gemini CLI	Anthropic Claude Code
Default / target model	Qwen3-Coder	Google Gemini	Claude (Anthropic)
Lineage	Forked from Gemini CLI v0.8.2	Original (Google)	Original (Anthropic)
Other providers	OpenAI / Anthropic / Gemini / Qwen / Ollama / vLLM	Gemini-focused	Claude (configurable base URL)

Qwen Code and Gemini CLI are both open source and share a common ancestor, so their terminal experience is similar. The practical reason to choose Qwen Code is the Qwen3-Coder tuning: the enhanced parser and function-calling protocols built for that model family. Claude Code is the proprietary option in the set.

The provider question applies to all three

Each of these CLIs can be pointed at a model other than its default through an OpenAI-compatible base URL or a configured provider. For Claude Code specifically, see running a different LLM in Claude Code. For Qwen Code, the same idea is covered in the next section.

Using an OpenAI-Compatible Endpoint

Because Qwen Code supports OpenAI-compatible APIs, you can point it at any endpoint that implements /v1/chat/completions. The standard pattern is to set three environment variables: the base URL, the API key, and the model name. Then configure the provider with /auth inside the CLI.

Point Qwen Code at an OpenAI-compatible endpoint

# Set the OpenAI-compatible endpoint, key, and model as env vars.
# Replace the placeholders with your provider's values.
export OPENAI_BASE_URL="https://your-provider.example.com/v1"
export OPENAI_API_KEY="sk-your-key"
export OPENAI_MODEL="your-model-name"

# Start Qwen Code, then select the OpenAI-compatible provider
qwen
# inside the CLI:
/auth

The exact variable names depend on the Qwen Code version, so confirm them with qwen --help or the project README. The pattern itself is stable: base URL plus key plus model. Any model served behind an OpenAI-compatible gateway becomes usable from the same CLI.

This is the mechanism that lets you decouple the agent from the model. You keep the Qwen Code terminal experience and swap the backend: a hosted Qwen3-Coder, a local model via Ollama or vLLM, or a router that picks the model for you.

Running Qwen Models Through Morph

Morph serves Qwen models on its production fleet behind one OpenAI-compatible API at api.morphllm.com/v1. The fleet includes morph-qwen35-397b (Morph measured ~120 tok/s on its fleet) and morph-qwen36-27b, alongside other served models. Because the API is OpenAI-compatible, Qwen Code (or any OpenAI-shaped CLI) can use it by setting the base URL and a Morph API key.

This is the same decoupling pattern as above, applied to Morph. You point Qwen Code at https://api.morphllm.com/v1, supply a Morph API key, and pick a Qwen model name. The CLI stays the same; the backend is Morph's fleet.

Point an OpenAI-compatible CLI at Morph

# Run a Qwen-class model on Morph's fleet through any
# OpenAI-compatible CLI (Qwen Code, etc.)
export OPENAI_BASE_URL="https://api.morphllm.com/v1"
export OPENAI_API_KEY="sk-your-morph-key"
export OPENAI_MODEL="morph-qwen35-397b"

qwen
# then configure the OpenAI-compatible provider with /auth

Beyond serving the model, Morph adds a model router that classifies prompt difficulty in ~430ms and routes each request to the cheapest model that can handle it, for 40-70% API cost savings at about $0.001 per classification. So a coding session that sends most of its easy turns to a smaller model still keeps the frontier model for the hard ones, without changing the CLI.

One endpoint, many models

Pointing an OpenAI-compatible CLI at api.morphllm.com/v1 gives access to many models through a single API and key. See Qwen3-Coder for the model itself, the router for cost routing, and use a different LLM in Claude Code for the same pattern applied to Anthropic's CLI.

Frequently Asked Questions

What is Qwen Code (Qwen CLI)?

Qwen Code is an open-source AI coding agent that runs in the terminal, published on npm as @qwen-code/qwen-code. It is a research-purpose CLI adapted from Google Gemini CLI v0.8.2 with an enhanced parser and tool support tuned for Qwen3-Coder models. The command is qwen after install.

How do I install Qwen Code?

Install it globally from npm with npm install -g @qwen-code/qwen-code@latest, then verify with qwen --version. You can also build it from source by cloning the repository. After install, run qwen and configure a provider with the /auth command.

Is Qwen Code free?

Qwen OAuth, the free-quota login flow, was discontinued on April 15, 2026. The CLI is open source and free to install, but model access now requires an API key or a coding-plan subscription such as the Alibaba Cloud Coding Plan, OpenRouter, or Fireworks AI. There is no documented free quota after that date.

What models does Qwen Code use?

Qwen Code is built for Qwen3-Coder models. The flagship target is Qwen3-Coder-480B-A35B-Instruct, a 480B-parameter Mixture-of-Experts model with 35B active parameters, 256K tokens of native context, and up to 1M tokens with extrapolation. It can also drive OpenAI-, Anthropic-, Gemini-, and Qwen-compatible APIs, plus local models via Ollama or vLLM.

Qwen Code vs Gemini CLI: what is the difference?

Qwen Code was originally based on Google Gemini CLI v0.8.2, then forked into independent development from v0.1 onward. Qwen Code adds an enhanced parser and function-calling protocols tuned for Qwen3-Coder models, while Gemini CLI targets Google Gemini. Both are open source and run in the terminal.

Can Qwen Code use other models via an OpenAI-compatible endpoint?

Yes. Supported providers include OpenAI-, Anthropic-, Gemini-, and Qwen-compatible APIs, plus any third-party provider or local model via Ollama or vLLM. Because Qwen Code speaks the OpenAI-compatible protocol, you point it at any OpenAI-shaped endpoint by setting the base URL, API key, and model name, then configuring it with /auth.

Can I run Qwen3-Coder through Morph with Qwen Code?

Yes. Morph serves Qwen models on its production fleet (including morph-qwen35-397b at ~120 tok/s and morph-qwen36-27b) behind one OpenAI-compatible API at api.morphllm.com/v1. Point Qwen Code at that base URL with a Morph API key to run Qwen3-Coder-class models through Morph's router.

Related Resources

Run Qwen Models Through One OpenAI-Compatible API

Morph serves Qwen models on its fleet (morph-qwen35-397b at ~120 tok/s, morph-qwen36-27b) behind one OpenAI-compatible API at api.morphllm.com/v1. Point Qwen Code, or any OpenAI-shaped CLI, at the endpoint with a Morph API key. Add the router for 40-70% cost savings at ~$0.001 per classification.

Get a Morph API Key

View API Docs

Fast Apply

WarpGrep

Compact

Model Router

DeepSeek

MiniMax

Qwen

Glance

Blog

Startup Credits

Students

Contact Us

About

Careers