Codex Provider Configuration: How to Use --provider, config.toml, and Custom Endpoints

Configure Codex CLI to use OpenRouter, Azure, Ollama, Mistral, or any OpenAI-compatible provider. Full config.toml reference with working examples.

March 4, 2026

How Provider Selection Works

Codex CLI loads configuration from ~/.codex/config.toml at startup. Two keys control which API it talks to:

  • model sets the model identifier (e.g., gpt-5-codex)
  • model_provider selects which provider definition to use (default: openai)

Provider definitions live under [model_providers.<id>] in the same file. Each definition specifies a base URL, authentication method, and wire protocol. Codex ships with two built-in providers: openai (using the Responses API) and oss (using the Chat Completions API for local models).

Minimal custom provider

# ~/.codex/config.toml
model = "gpt-5.1"
model_provider = "proxy"

[model_providers.proxy]
name = "OpenAI via LLM proxy"
base_url = "http://proxy.example.com"
env_key = "OPENAI_API_KEY"

The env_key field tells Codex which environment variable holds the API key. Codex reads that variable at runtime and sends it as a Bearer token. Never put API keys directly in config.toml; an experimental_bearer_token field exists for that purpose, but it is discouraged.

Configuration Precedence

CLI flags override profile settings, which override config.toml defaults. The hierarchy:

  1. Command-line flags (-c, --model, --oss)
  2. Profile-specific settings (--profile lightweight)
  3. ~/.codex/config.toml defaults
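In practice this means the same key can be set at all three levels, and the most specific one wins. A sketch: if a profile selects one provider and a -c flag selects another, the flag applies for that invocation only (the profile name here is illustrative).

Precedence in practice

# [profiles.lightweight] sets model_provider = "openai";
# the -c flag beats it for this single run
codex --profile lightweight -c model_provider='"oss"' "run the linter"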

Configuration Reference

Every provider definition supports the following keys. Only base_url is required for most setups.

Key                       Type                    Description
name                      string                  Display name shown in the CLI
base_url                  string                  API endpoint URL
env_key                   string                  Environment variable holding the API key
wire_api                  "chat" | "responses"    Protocol: Chat Completions or Responses API
http_headers              map                     Static headers added to every request
env_http_headers          map                     Headers populated from env vars at runtime
query_params              map                     Query parameters appended to requests (used by Azure)
request_max_retries       number                  HTTP retry count (default: 4)
stream_idle_timeout_ms    number                  SSE idle timeout in ms (default: 300000)
stream_max_retries        number                  SSE stream retry count (default: 5)

OPENAI_BASE_URL Shortcut

For quick one-off overrides without editing config.toml, set the OPENAI_BASE_URL environment variable. Codex reads this and overrides the default OpenAI endpoint for that session.

Quick override via environment variable

# Point Codex at a different endpoint for one session
export OPENAI_BASE_URL="https://your-proxy.example.com/v1"
codex "refactor the auth module"

CLI Flags for Providers

Codex does not have a literal --provider flag. Provider selection happens through three mechanisms (--oss, -c, and --profile), covered below alongside the related --model flag.

The --model flag

Overrides the configured model for a single session. Does not change the provider endpoint.

Override model per session

codex --model gpt-5.1 "explain this codebase"
codex -m gpt-5-codex "fix the failing tests"

The --oss flag

Switches to the local open-source provider. Equivalent to -c model_provider="oss". Codex validates that Ollama (or LM Studio) is running before starting.

Use local models

# Use Ollama
codex --oss "write unit tests for utils.ts"

# Equivalent long form
codex -c model_provider='"oss"' "write unit tests for utils.ts"

The -c / --config flag

Overrides any config.toml key for a single invocation. Values parse as JSON when possible. This is how you select a custom provider from the command line.

Override provider via -c flag

# Use your custom "morph" provider for this session
codex -c model_provider='"morph"' -c model='"morph-v3-fast"' "apply the diff"

# Switch to OpenRouter
codex -c model_provider='"openrouter"' -c model='"anthropic/claude-sonnet-4.5"' "review this PR"

The --profile flag

Activates a named configuration profile. Profiles let you bundle model, provider, and other settings under a single name. More on this in the Profiles section.

Use a named profile

codex --profile fast-local "refactor this function"
codex -p openrouter "explain the auth flow"

Provider Examples

Working configs for the providers developers ask about most. Each example is a complete ~/.codex/config.toml snippet you can paste and modify.

OpenRouter

Access 200+ models through one API. OpenRouter handles routing, fallback, and billing.

OpenRouter provider

model = "anthropic/claude-sonnet-4.5"
model_provider = "openrouter"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "chat"

Ollama (Local)

Run models locally with zero API costs. Requires Ollama installed and a model pulled.

Ollama provider

model = "qwen2.5-coder:32b"
model_provider = "ollama"

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"

--oss vs custom Ollama provider

The --oss flag uses Codex's built-in OSS provider, which validates Ollama is running and auto-discovers models. Defining a custom [model_providers.ollama] section gives you more control over the base URL and headers, but skips the auto-validation. Use --oss for quick local sessions. Use a custom provider when you need to point at a remote Ollama instance or add custom headers.
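If you go the custom-provider route, you can replicate the --oss health check yourself before launching a session. A minimal sketch, assuming Ollama's default port and its /api/tags model-listing endpoint:

Check Ollama before a session

# Lists locally pulled models; fails fast if Ollama is not running
curl -sf http://localhost:11434/api/tags \
  && codex -c model_provider='"ollama"' "write tests for utils.ts"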

Azure OpenAI

For teams on Azure with data residency or compliance requirements.

Azure OpenAI provider

model = "gpt-5-codex"
model_provider = "azure"

[model_providers.azure]
name = "Azure"
base_url = "https://YOUR_PROJECT.openai.azure.com/openai"
env_key = "AZURE_OPENAI_API_KEY"
wire_api = "responses"
query_params = { api-version = "2025-04-01-preview" }

Mistral

Mistral provider

model = "codestral-latest"
model_provider = "mistral"

[model_providers.mistral]
name = "Mistral"
base_url = "https://api.mistral.ai/v1"
env_key = "MISTRAL_API_KEY"
wire_api = "chat"

LM Studio (Local)

LM Studio provider

model = "qwen2.5-coder-32b"
model_provider = "lmstudio"

[model_providers.lmstudio]
name = "LM Studio"
base_url = "http://localhost:1234/v1"
wire_api = "chat"

OpenAI Data Residency

For organizations that need API traffic routed through a specific region.

OpenAI data residency

model_provider = "openaidr"

[model_providers.openaidr]
name = "OpenAI Data Residency"
base_url = "https://us.api.openai.com/v1"
wire_api = "responses"

Using Morph as a Provider

Morph specializes in fast code transformations. The morph-v3-fast model runs code edits at 10,500+ tokens per second, which makes it useful as a Codex provider for apply-heavy workflows where the bottleneck is edit speed, not planning.

Morph provider configuration

model = "morph-v3-fast"
model_provider = "morph"

[model_providers.morph]
name = "Morph"
base_url = "https://api.morphllm.com/v1"
env_key = "MORPH_API_KEY"
wire_api = "chat"

Set the API key

export MORPH_API_KEY="your-api-key-here"
codex -c model_provider='"morph"' "apply this diff to all test files"

Morph is not a general-purpose coding model. It excels at code editing, diff application, and file transformations. For planning and reasoning, pair it with a frontier model. See Codex pricing for how costs compare across providers.

When to use Morph with Codex

Use Morph when your workflow involves many sequential file edits, large-scale refactors, or diff application. Morph processes 10,500+ tok/s compared to 200-500 tok/s from general-purpose models. For exploratory coding, debugging, or architecture decisions, stick with GPT-5-Codex or Claude.

Profiles for Provider Switching

Profiles bundle model, provider, and other settings under a name. Instead of remembering -c flags, you switch with --profile.

Define profiles in config.toml

# Default: OpenAI
model = "gpt-5-codex"
model_provider = "openai"

# Fast local development
[profiles.local]
model = "qwen2.5-coder:32b"
model_provider = "ollama"

# OpenRouter for model variety
[profiles.openrouter]
model = "anthropic/claude-sonnet-4.5"
model_provider = "openrouter"

# Morph for bulk edits
[profiles.morph]
model = "morph-v3-fast"
model_provider = "morph"

# Provider definitions
[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"

[model_providers.openrouter]
name = "OpenRouter"
base_url = "https://openrouter.ai/api/v1"
env_key = "OPENROUTER_API_KEY"
wire_api = "chat"

[model_providers.morph]
name = "Morph"
base_url = "https://api.morphllm.com/v1"
env_key = "MORPH_API_KEY"
wire_api = "chat"

Switch profiles from the CLI

# Use local Ollama
codex --profile local "write tests for auth.ts"

# Use OpenRouter
codex -p openrouter "explain this error"

# Use Morph for fast edits
codex -p morph "rename all instances of userId to user_id"

Profiles also support per-profile oss_provider settings. If you have both Ollama and LM Studio installed, you can assign different local backends to different profiles.
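A sketch of what that could look like, assuming oss_provider accepts the same backend names --oss validates against (ollama, lmstudio):

Per-profile local backends

[profiles.ollama-local]
model = "qwen2.5-coder:32b"
model_provider = "oss"
oss_provider = "ollama"

[profiles.studio-local]
model = "qwen2.5-coder-32b"
model_provider = "oss"
oss_provider = "lmstudio"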

Troubleshooting

Authentication failures

The most common issue. Codex reads the API key from the environment variable specified in env_key. Verify the variable is set and exported in your shell.

Debug authentication

# Check if the env var is set
echo $OPENROUTER_API_KEY

# If empty, set it
export OPENROUTER_API_KEY="sk-or-v1-..."

# For persistent config, add to ~/.zshrc or ~/.bashrc
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc

wire_api mismatch

If Codex sends requests the provider does not understand, check wire_api. OpenAI and Azure use "responses". Everything else (OpenRouter, Ollama, Mistral, Morph, LM Studio) uses "chat".

Model not found errors

The model value must match exactly what the provider expects. OpenRouter uses format provider/model-name (e.g., anthropic/claude-sonnet-4.5). Ollama uses the local model name (e.g., qwen2.5-coder:32b). OpenAI uses IDs like gpt-5-codex.
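When in doubt, ask the provider for its model list directly rather than guessing at IDs. A sketch for OpenRouter (assumes jq is installed) and for a local Ollama install:

List available model IDs

# OpenRouter: list model IDs via its models endpoint
curl -s -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  https://openrouter.ai/api/v1/models | jq -r '.data[].id' | head

# Ollama: list what you have pulled locally
ollama list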

Streaming timeouts with local models

Large local models can be slow to generate the first token. Increase the idle timeout:

Increase timeout for slow local models

[model_providers.ollama]
name = "Ollama"
base_url = "http://localhost:11434/v1"
wire_api = "chat"
stream_idle_timeout_ms = 600000  # 10 minutes
request_max_retries = 2

Tool calling not supported

Codex relies on function/tool calling to read files, run commands, and edit code. Many smaller open-source models do not support tool calling. If Codex errors out or produces no file operations, the model likely lacks tool support. Stick to models that implement the OpenAI tool calling spec: GPT-5.x, Claude, Qwen 2.5 Coder (32B+), Mistral Large, and Codestral.
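You can probe tool support outside Codex with a single Chat Completions request. A sketch against a local Ollama endpoint; the read_file tool here is a hypothetical example, not one of Codex's actual tools:

Probe tool calling support

curl -s http://localhost:11434/v1/chat/completions -d '{
  "model": "qwen2.5-coder:32b",
  "messages": [{"role": "user", "content": "Read config.toml"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "read_file",
      "parameters": {"type": "object", "properties": {"path": {"type": "string"}}}
    }
  }]
}'

A tool-capable model responds with a tool_calls entry in its message; models without tool support ignore the tools array or return an error.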

FAQ

What is the --provider flag in Codex CLI?

Codex CLI does not have a literal --provider flag. Provider selection uses -c model_provider="provider_name" or the model_provider key in ~/.codex/config.toml. The --oss flag is a shortcut equivalent to -c model_provider="oss" for local models via Ollama or LM Studio.

How do I use a different model with Codex?

Use the --model flag (codex --model gpt-5.1) to change the model for a single session. To change the provider endpoint, set model_provider in config.toml and define the provider under [model_providers.your_provider] with base_url and env_key.

Can Codex use OpenRouter?

Yes. Add a [model_providers.openrouter] section to ~/.codex/config.toml with base_url = "https://openrouter.ai/api/v1" and set wire_api = "chat". Then set model_provider = "openrouter" and model to any OpenRouter model ID like anthropic/claude-sonnet-4.5.

Can Codex use local models via Ollama?

Yes. Either use the built-in --oss flag (which validates Ollama is running) or define a custom provider with base_url = "http://localhost:11434/v1". Set oss_provider = "ollama" in config.toml to make --oss default to Ollama.

What is the wire_api setting?

wire_api controls which API protocol Codex uses to talk to the provider. Options are "chat" (OpenAI Chat Completions API) and "responses" (OpenAI Responses API). Most third-party providers use "chat". OpenAI's own endpoint and Azure use "responses".

How do I use Morph as a Codex provider?

Add a [model_providers.morph] section with base_url = "https://api.morphllm.com/v1", env_key = "MORPH_API_KEY", and wire_api = "chat". Set model_provider = "morph" and model to your chosen model. Morph's fast-apply models specialize in code editing at 10,500+ tokens per second.

Related Guides

Speed Up Code Edits in Codex

Morph processes code transformations at 10,500+ tok/s. Add it as a Codex provider and cut edit latency by 20x.