How Provider Selection Works
Codex CLI loads configuration from ~/.codex/config.toml at startup. Two keys control which API it talks to:
- model sets the model identifier (e.g., gpt-5-codex)
- model_provider selects which provider definition to use (default: openai)
Provider definitions live under [model_providers.<id>] in the same file. Each definition specifies a base URL, authentication method, and wire protocol. Codex ships with two built-in providers: openai (using the Responses API) and oss (using the Chat Completions API for local models).
Minimal custom provider
```toml
# ~/.codex/config.toml
model = "gpt-5.1"
model_provider = "proxy"

[model_providers.proxy]
name = "OpenAI via LLM proxy"
base_url = "http://proxy.example.com"
env_key = "OPENAI_API_KEY"
```

The env_key field tells Codex which environment variable holds the API key. Codex reads that variable at runtime and sends it as a Bearer token. You never put API keys directly in config.toml (though an experimental_bearer_token field exists, it is discouraged).
Configuration Precedence
CLI flags override profile settings, which override config.toml defaults. The hierarchy:
- Command-line flags (-c, --model, --oss)
- Profile-specific settings (--profile lightweight)
- ~/.codex/config.toml defaults
Configuration Reference
Every provider definition supports the following keys. Only base_url is required for most setups.
| Key | Type | Description |
|---|---|---|
| name | string | Display name shown in the CLI |
| base_url | string | API endpoint URL |
| env_key | string | Environment variable holding the API key |
| wire_api | string | Protocol: "chat" (Chat Completions API) or "responses" (Responses API) |
| http_headers | map | Static headers added to every request |
| env_http_headers | map | Headers populated from env vars at runtime |
| query_params | map | Query parameters appended to requests (used by Azure) |
| request_max_retries | number | HTTP retry count (default: 4) |
| stream_idle_timeout_ms | number | SSE idle timeout in ms (default: 300000) |
| stream_max_retries | number | SSE stream retry count (default: 5) |
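A provider definition that exercises the optional keys might look like this. The gateway host, header names, and values are placeholders for illustration, not defaults:

```toml
# Hypothetical provider using the optional keys above
[model_providers.gateway]
name = "Internal gateway"
base_url = "https://llm-gateway.example.com/v1"
env_key = "GATEWAY_API_KEY"
wire_api = "chat"
http_headers = { "X-Team" = "platform" }          # static header on every request
env_http_headers = { "X-Request-User" = "USER" }  # value read from $USER at runtime
request_max_retries = 2
stream_idle_timeout_ms = 600000
```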
OPENAI_BASE_URL Shortcut
For quick one-off overrides without editing config.toml, set the OPENAI_BASE_URL environment variable. Codex reads this and overrides the default OpenAI endpoint for that session.
Quick override via environment variable
```shell
# Point Codex at a different endpoint for one session
export OPENAI_BASE_URL="https://your-proxy.example.com/v1"
codex "refactor the auth module"
```

CLI Flags for Providers
Codex does not have a literal --provider flag. Provider selection happens through three mechanisms:
The --model flag
Overrides the configured model for a single session. Does not change the provider endpoint.
Override model per session
```shell
codex --model gpt-5.1 "explain this codebase"
codex -m gpt-5-codex "fix the failing tests"
```

The --oss flag
Switches to the local open-source provider. Equivalent to -c model_provider="oss". Codex validates that Ollama (or LM Studio) is running before starting.
Use local models
```shell
# Use Ollama
codex --oss "write unit tests for utils.ts"

# Equivalent long form
codex -c model_provider='"oss"' "write unit tests for utils.ts"
```

The -c / --config flag
Overrides any config.toml key for a single invocation. Values parse as JSON when possible. This is how you select a custom provider from the command line.
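The "parse as JSON when possible" rule can be sketched as a try-JSON-then-fall-back-to-string step. This mimics the behavior described above; it is not Codex's actual parser:

```python
import json

def parse_override(raw):
    """Parse a -c value as JSON; fall back to the raw string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw

# Inner double quotes force string interpretation; the outer single
# quotes are stripped by the shell before Codex sees the value.
print(parse_override('"morph"'))  # the string "morph"
print(parse_override('"42"'))     # the string "42"
print(parse_override('42'))       # the number 42
```

This is why the examples quote values as '"morph"': the shell removes the single quotes, and the remaining double quotes make the value parse as a JSON string.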
Override provider via -c flag
```shell
# Use your custom "morph" provider for this session
codex -c model_provider='"morph"' -c model='"morph-v3-fast"' "apply the diff"

# Switch to OpenRouter
codex -c model_provider='"openrouter"' -c model='"anthropic/claude-sonnet-4.5"' "review this PR"
```

The --profile flag
Activates a named configuration profile. Profiles let you bundle model, provider, and other settings under a single name. More on this in the Profiles section.
Use a named profile
```shell
codex --profile fast-local "refactor this function"
codex -p openrouter "explain the auth flow"
```

Provider Examples
Working configs for the providers developers ask about most. Each example is a complete ~/.codex/config.toml snippet you can paste and modify.
OpenRouter
Access 200+ models through one API. OpenRouter handles routing, fallback, and billing.
OpenRouter provider
```toml
model = "anthropic/claude-sonnet-4.5"
model_provider = "openrouter"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "chat"
```

Ollama (Local)
Run models locally with zero API costs. Requires Ollama installed and a model pulled.
Ollama provider
```toml
model = "qwen2.5-coder:32b"
model_provider = "ollama"

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"
```

--oss vs custom Ollama provider
The --oss flag uses Codex's built-in OSS provider, which validates Ollama is running and auto-discovers models. Defining a custom [model_providers.ollama] section gives you more control over the base URL and headers, but skips the auto-validation. Use --oss for quick local sessions. Use a custom provider when you need to point at a remote Ollama instance or add custom headers.
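For the remote-instance case, a custom provider might look like this. The hostname and header are placeholders for whatever your deployment uses:

```toml
# Hypothetical remote Ollama instance behind a reverse proxy
[model_providers.ollama-remote]
name = "Ollama (remote)"
base_url = "http://gpu-box.internal:11434/v1"
wire_api = "chat"
http_headers = { "X-Proxy-Auth" = "lab" }  # example static header
```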
Azure OpenAI
For teams on Azure with data residency or compliance requirements.
Azure OpenAI provider
```toml
model = "gpt-5-codex"
model_provider = "azure"

[model_providers.azure]
name = "Azure"
base_url = "https://YOUR_PROJECT.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
wire_api = "responses"
query_params = { api-version = "2025-04-01-preview" }
```

Mistral
Mistral provider
```toml
model = "codestral-latest"
model_provider = "mistral"

[model_providers.mistral]
name = "Mistral"
base_url = "https://api.mistral.ai/v1"
env_key = "MISTRAL_API_KEY"
wire_api = "chat"
```

LM Studio (Local)
LM Studio provider
```toml
model = "qwen2.5-coder-32b"
model_provider = "lmstudio"

[model_providers.lmstudio]
name = "LM Studio"
base_url = "http://localhost:1234/v1"
wire_api = "chat"
```

OpenAI Data Residency
For organizations that need API traffic routed through a specific region.
OpenAI data residency
```toml
model_provider = "openaidr"

[model_providers.openaidr]
name = "OpenAI Data Residency"
base_url = "https://us.api.openai.com/v1"
wire_api = "responses"
```

Using Morph as a Provider
Morph specializes in fast code transformations. The morph-v3-fast model runs code edits at 10,500+ tokens per second, which makes it useful as a Codex provider for apply-heavy workflows where the bottleneck is edit speed, not planning.
Morph provider configuration
```toml
model = "morph-v3-fast"
model_provider = "morph"

[model_providers.morph]
name = "Morph"
base_url = "https://api.morphllm.com/v1"
env_key = "MORPH_API_KEY"
wire_api = "chat"
```

Set the API key
```shell
export MORPH_API_KEY="your-api-key-here"
codex -c model_provider='"morph"' "apply this diff to all test files"
```

Morph is not a general-purpose coding model. It excels at code editing, diff application, and file transformations. For planning and reasoning, pair it with a frontier model. See Codex pricing for how costs compare across providers.
When to use Morph with Codex
Use Morph when your workflow involves many sequential file edits, large-scale refactors, or diff application. Morph processes 10,500+ tok/s compared to 200-500 tok/s from general-purpose models. For exploratory coding, debugging, or architecture decisions, stick with GPT-5-Codex or Claude.
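Rough arithmetic using the throughput figures quoted above makes the difference concrete. The 30,000-token edit volume is an arbitrary illustration:

```python
# Back-of-envelope latency comparison using the throughput figures above.
edit_tokens = 30_000     # arbitrary example: a large multi-file refactor
morph_tps = 10_500       # tokens/sec, figure quoted above
general_tps = 350        # midpoint of the 200-500 tok/s range

morph_seconds = edit_tokens / morph_tps
general_seconds = edit_tokens / general_tps

print(f"Morph:   {morph_seconds:.1f}s")    # ~2.9s
print(f"General: {general_seconds:.1f}s")  # ~85.7s
```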
Profiles for Provider Switching
Profiles bundle model, provider, and other settings under a name. Instead of remembering -c flags, you switch with --profile.
Define profiles in config.toml
```toml
# Default: OpenAI
model = "gpt-5-codex"
model_provider = "openai"

# Fast local development
[profiles.local]
model = "qwen2.5-coder:32b"
model_provider = "ollama"

# OpenRouter for model variety
[profiles.openrouter]
model = "anthropic/claude-sonnet-4.5"
model_provider = "openrouter"

# Morph for bulk edits
[profiles.morph]
model = "morph-v3-fast"
model_provider = "morph"

# Provider definitions
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "chat"

[model_providers.morph]
name = "Morph"
base_url = "https://api.morphllm.com/v1"
env_key = "MORPH_API_KEY"
wire_api = "chat"
```

Switch profiles from the CLI
```shell
# Use local Ollama
codex --profile local "write tests for auth.ts"

# Use OpenRouter
codex -p openrouter "explain this error"

# Use Morph for fast edits
codex -p morph "rename all instances of userId to user_id"
```

Profiles also support per-profile oss_provider settings. If you have both Ollama and LM Studio installed, you can assign different local backends to different profiles.
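A sketch of that setup, assuming oss_provider is accepted inside a profile table (the profile names are arbitrary):

```toml
# Assumes oss_provider is valid per profile
[profiles.local-ollama]
model = "qwen2.5-coder:32b"
oss_provider = "ollama"

[profiles.local-lms]
model = "qwen2.5-coder-32b"
oss_provider = "lmstudio"
```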
Troubleshooting
Authentication failures
The most common issue. Codex reads the API key from the environment variable specified in env_key. Verify the variable is set and exported in your shell.
Debug authentication
```shell
# Check if the env var is set
echo $OPENROUTER_API_KEY

# If empty, set it
export OPENROUTER_API_KEY="sk-or-v1-..."

# For persistent config, add to ~/.zshrc or ~/.bashrc
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc
```

wire_api mismatch
If Codex sends requests the provider does not understand, check wire_api. OpenAI and Azure use "responses". Everything else (OpenRouter, Ollama, Mistral, Morph, LM Studio) uses "chat".
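The two protocols differ in request shape. A simplified sketch, trimmed to the essentials (real requests carry more fields; check the provider's docs for the full schema):

```python
# Simplified request bodies for the two wire protocols.
def chat_body(model, prompt):
    """Chat Completions style: a list of role-tagged messages."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def responses_body(model, prompt):
    """Responses API style: an input field instead of messages."""
    return {"model": model, "input": prompt}

# A provider expecting one shape will reject or misread the other,
# which is the symptom of a wire_api mismatch.
print(chat_body("codestral-latest", "hi"))
print(responses_body("gpt-5-codex", "hi"))
```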
Model not found errors
The model value must match exactly what the provider expects. OpenRouter uses format provider/model-name (e.g., anthropic/claude-sonnet-4.5). Ollama uses the local model name (e.g., qwen2.5-coder:32b). OpenAI uses IDs like gpt-5-codex.
Streaming timeouts with local models
Large local models can be slow to generate the first token. Increase the idle timeout:
Increase timeout for slow local models
```toml
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"
stream_idle_timeout_ms = 600000  # 10 minutes
request_max_retries = 2
```

Tool calling not supported
Codex relies on function/tool calling to read files, run commands, and edit code. Many smaller open-source models do not support tool calling. If Codex errors out or produces no file operations, the model likely lacks tool support. Stick to models that implement the OpenAI tool calling spec: GPT-5.x, Claude, Qwen 2.5 Coder (32B+), Mistral Large, and Codestral.
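One way to check a model is to send a request carrying a trivial tool defined in the OpenAI function-calling format and see whether the response contains a tool call. The tool definition looks like this (read_file is a throwaway example, not one of Codex's actual tools):

```python
# Minimal tool definition in the OpenAI function-calling format.
def make_tool(name, description, properties, required):
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

tool = make_tool(
    "read_file",
    "Read a file from the workspace",
    {"path": {"type": "string", "description": "File path to read"}},
    ["path"],
)
# A model with tool support should answer a request with tools=[tool]
# by emitting a tool_calls entry; one without will ignore it or error.
```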
FAQ
What is the --provider flag in Codex CLI?
Codex CLI does not have a literal --provider flag. Provider selection uses -c model_provider="provider_name" or the model_provider key in ~/.codex/config.toml. The --oss flag is a shortcut equivalent to -c model_provider="oss" for local models via Ollama or LM Studio.
How do I use a different model with Codex?
Use the --model flag (codex --model gpt-5.1) to change the model for a single session. To change the provider endpoint, set model_provider in config.toml and define the provider under [model_providers.your_provider] with base_url and env_key.
Can Codex use OpenRouter?
Yes. Add a [model_providers.openrouter] section to ~/.codex/config.toml with base_url = "https://openrouter.ai/api/v1" and set wire_api = "chat". Then set model_provider = "openrouter" and model to any OpenRouter model ID like anthropic/claude-sonnet-4.5.
Can Codex use local models via Ollama?
Yes. Either use the built-in --oss flag (which validates Ollama is running) or define a custom provider with base_url = "http://localhost:11434/v1". Set oss_provider = "ollama" in config.toml to make --oss default to Ollama.
What is the wire_api setting?
wire_api controls which API protocol Codex uses to talk to the provider. Options are "chat" (OpenAI Chat Completions API) and "responses" (OpenAI Responses API). Most third-party providers use "chat". OpenAI's own endpoint and Azure use "responses".
How do I use Morph as a Codex provider?
Add a [model_providers.morph] section with base_url = "https://api.morphllm.com/v1", env_key = "MORPH_API_KEY", and wire_api = "chat". Set model_provider = "morph" and model to your chosen model. Morph's fast-apply models specialize in code editing at 10,500+ tokens per second.
Related Guides
- Codex Pricing Breakdown - Cost comparison across tiers and providers
- Use a Different LLM with Claude Code - Similar guide for Claude Code's provider system
- Codex vs Claude Code - Full feature and benchmark comparison
Speed Up Code Edits in Codex
Morph processes code transformations at 10,500+ tok/s. Add it as a Codex provider and cut edit latency by 20x.