You standardized on Claude, and your token bill grows with every developer you add. Most of those calls are routine and do not need Opus. They need Haiku. Both Not Diamond and Morph Router solve this, with different designs. Not Diamond trains a learned router across 60+ models from many providers. Morph Router classifies prompt difficulty in ~430ms and routes within a provider family, including an Anthropic-only mode that stays inside Haiku 4.5 / Sonnet 4.5 / Opus. Below: where each one fits, with exact mechanics and prices.
The Problem: Routine Prompts on Frontier Models
A coding agent standardized on Claude sends every request to whatever model it is configured for, usually Sonnet or Opus. But most prompts in a session are routine: adding imports, writing a docstring, renaming a symbol, generating a boilerplate test. These produce the same output on Haiku 4.5 ($1/$5 per M tokens) as on Opus ($5/$25 per M tokens). The difference is the bill.
On coding-agent traffic, 60 to 80 percent of requests are routine. Without routing, all of them hit the expensive model. A team running thousands of developer sessions a month is paying frontier prices for work a cheap model handles identically. The fix is a router that downgrades the easy prompts and reserves the frontier model for the 15 to 20 percent of prompts that actually need it.
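To make the size of the gap concrete, here is a minimal sketch of the blended input price with and without routing. The per-million prices are the ones quoted above; the 70/15/15 split of token volume across tiers is an illustrative assumption, not measured data, and the function and variable names are hypothetical.

```typescript
// Per-million-token prices quoted in this article (USD).
type Tier = "haiku" | "sonnet" | "opus";

const PRICE_PER_M: Record<Tier, { input: number; output: number }> = {
  haiku: { input: 1, output: 5 },
  sonnet: { input: 3, output: 15 },
  opus: { input: 5, output: 25 },
};

// ASSUMPTION: illustrative share of token volume per tier, not measured data.
const assumedShare: Record<Tier, number> = { haiku: 0.7, sonnet: 0.15, opus: 0.15 };

// Blended input price per million tokens for a given traffic split.
function blendedInputPrice(share: Record<Tier, number>): number {
  return (Object.keys(share) as Tier[]).reduce(
    (sum, tier) => sum + share[tier] * PRICE_PER_M[tier].input,
    0,
  );
}

const routed = blendedInputPrice(assumedShare);
const unrouted = PRICE_PER_M.opus.input; // everything on the frontier model
const savings = 1 - routed / unrouted;

console.log(`routed $${routed.toFixed(2)}/M vs unrouted $${unrouted.toFixed(2)}/M`);
console.log(`savings: ${(savings * 100).toFixed(0)}%`);
```

Under that assumed split the blended input price works out to about $1.90 per million tokens against $5.00 unrouted, roughly 62 percent lower. Your own distribution will move the number, which is why it is worth measuring before forecasting.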
Two products do this. The decision between them is not "which is better" in the abstract. It is which routing model matches your stack: a learned cross-provider router that you train on your own data, or a difficulty classifier that routes inside the provider family you already standardized on.
The routing decision is a resource-allocation decision
This is the same pattern every distributed system uses: match the resource to the task. HTTP load balancers route by path, query planners choose between index and full scans, CDNs route by geography. LLM routing applies it to model selection. The open question is how you decide, and whether that decision survives the next model release. See how automatic model routing works.
What Not Diamond Does Well
Not Diamond is a learned model router for coding agents. Its docs describe the router as a meta-model: you supply prompts, candidate model responses, and evaluation scores, and it learns which model produces the best answer per query type for the lowest cost. The premise is sound. Rarely does one model win on every query, so a router that picks per prompt can beat any single model on the cost-quality frontier.
Where Not Diamond is genuinely strong:
Cross-provider breadth
Routes across 60+ models from many providers. If you want one router to pick the best model across the whole market (Claude, GPT, Gemini, open models), that breadth is the design point.
Trains on your data
A custom router learns from your prompts and your evaluation scores, so it can fit your specific workload and even route to your own custom models included in the eval set.
Out-of-the-box router + OpenRouter Auto
Ships a ready router across its catalog and powers OpenRouter's Auto mode, so you can start routing without supplying training data first.
If your goal is a single router that arbitrages the entire model market and you can produce labeled evaluation data, Not Diamond is built for exactly that. The tradeoffs below are about a different goal: routing inside one provider family with no training step.
How the Two Routers Differ
The core difference is what each router predicts. Not Diamond predicts a specific model identity, learned from your evaluation data across a candidate set. Morph Router predicts a difficulty class, then maps that class to a model tier you configure. One is a learned model picker; the other is a difficulty classifier with a configurable tier map.
That single design choice cascades into three practical differences: whether you need training data and retraining, whether you can keep routing inside one provider, and whether you can read the decision. Each is covered below.
Morph Router: classify difficulty, route within a family
```typescript
// Morph Router returns a difficulty class AND a model from
// the provider family you ask for. No training data required.
const { model, difficulty } = await morph.routers.anthropic.selectModel({
  input: "Add a TODO comment above fetchUsers",
  mode: 'balanced'
})
// → { model: "claude-haiku-4-5", difficulty: "easy" }

// Hard prompt, same call, same family:
// "Debug this race condition in the distributed lock"
// → { model: "claude-opus-4", difficulty: "hard" }

// Then YOU make the call with your own Anthropic key
// (max_tokens is required by the Messages API):
const response = await anthropic.messages.create({ model, max_tokens: 4096, messages })
```
Morph Router vs Not Diamond
The two routers overlap on the goal (spend less by routing) and diverge on the method. This table maps the axes a buyer on Claude actually cares about.
| | Morph Router | Not Diamond |
|---|---|---|
| Routing method | Difficulty classifier (easy/medium/hard/needs_info) | Learned meta-router trained on your eval data |
| Routing scope | Within a provider family (Anthropic, OpenAI, Google) | Across 60+ models from many providers |
| Anthropic-only routing | Yes (Haiku 4.5 / Sonnet 4.5 / Opus) | Not the design focus (cross-provider) |
| Retraining on a new model | None (remap config) | Required (retrain with override=True) |
| Training data to start | None | Custom router needs >=15 samples; out-of-box router needs none |
| Explainability | Explicit difficulty class returned, inspectable | Model choice from a trained network |
| Latency added | ~430ms, hides behind request prep | Per-request routing call |
| Pricing model | Flat $0.001 per classification | Fixed fee per M tokens (rate not public) |
| Proxies the LLM call | No (recommends, you call) | Integrates via API with your gateway/harness |
Four rows are where a Claude-standardized team feels the difference: Anthropic-only routing, no retraining, an explainable decision, and a flat per-request price you can forecast.
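The "hides behind request prep" entry in the latency row means the classification can run concurrently with work you already do before calling the model. A minimal sketch of that overlap follows. Both async functions are hypothetical stubs that only simulate latency with short timers; neither is the Morph SDK nor a real context builder.

```typescript
type Difficulty = "easy" | "medium" | "hard" | "needs_info";

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// HYPOTHETICAL stand-in for the routing call; it only simulates latency.
async function classifyStub(_prompt: string): Promise<Difficulty> {
  await sleep(40);
  return "easy";
}

// HYPOTHETICAL stand-in for request prep (loading files, building context).
async function buildContextStub(prompt: string): Promise<string[]> {
  await sleep(50);
  return [`context for: ${prompt}`];
}

async function prepareRequest(prompt: string) {
  const started = Date.now();
  // Start both at once; the total wait is roughly the slower of the two,
  // not the sum.
  const [difficulty, context] = await Promise.all([
    classifyStub(prompt),
    buildContextStub(prompt),
  ]);
  return { difficulty, context, elapsedMs: Date.now() - started };
}

prepareRequest("Add a TODO comment above fetchUsers").then((r) =>
  console.log(r.difficulty, r.context.length, `${r.elapsedMs}ms`),
);
```

Whether the routing latency is fully hidden depends on how long your own preparation step takes; if prep is faster than the classifier, the remainder shows up as added latency.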
Anthropic-Only Routing
If your team is standardized on Claude, the goal is not to find the cheapest model in the entire market. It is to route between Claude tiers without leaving the Anthropic family. You keep one SDK, one billing relationship, one set of safety and behavior characteristics, and you still cut the bill by sending routine work to Haiku.
Morph's Anthropic router returns Anthropic model names only. Easy prompts get Haiku 4.5, medium prompts get Sonnet 4.5, hard prompts get Opus. The output of the router is always a Claude model, so nothing downstream changes: the same @anthropic-ai/sdk call, the same tool-use format, the same response shape.
| Difficulty | Claude model | Input / Output per M |
|---|---|---|
| Easy | Haiku 4.5 | $1 / $5 |
| Medium | Sonnet 4.6 | $3 / $15 |
| Hard | Opus 4.6 | $5 / $25 |
| Needs Info | Return to user | $0 |
A cross-provider learned router optimizes a different objective: it will happily route your prompt to a non-Claude model if its trained scores say that model is cheaper or better. That is correct behavior for a market-wide router, and the wrong behavior for a team that chose Claude on purpose. The Anthropic-only constraint is a feature, not a limitation, when the constraint is your requirement.
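If the single-family constraint is a hard requirement, you can also enforce it on your side regardless of which router you use. The sketch below is an application-level guard, not a feature of either product. The model IDs are the ones that appear in this article's examples, and the fallback choice is an assumption.

```typescript
// Model IDs as they appear in this article's examples.
const ALLOWED_CLAUDE_MODELS = new Set<string>([
  "claude-haiku-4-5",
  "claude-sonnet-4-6",
  "claude-opus-4-6",
]);

// Returns the recommended model if it is in the allowlist,
// otherwise a safe default (ASSUMPTION: the mid tier is an acceptable fallback).
function enforceFamily(recommended: string, fallback = "claude-sonnet-4-6"): string {
  return ALLOWED_CLAUDE_MODELS.has(recommended) ? recommended : fallback;
}

console.log(enforceFamily("claude-haiku-4-5"));
console.log(enforceFamily("some-other-provider-model"));
```

The first call passes the recommendation through unchanged; the second falls back to the default because the name is not in the allowlist.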
Same idea for OpenAI and Google teams
The provider-family constraint is not Anthropic-specific. morph.routers.openai.selectModel() returns GPT-5-mini / GPT-5. morph.routers.google.selectModel() returns Gemini 2.5 Flash / Pro. Pick the family you standardized on and route inside it.
Retraining When a New Model Ships
Model releases are frequent. Anthropic, OpenAI, and Google each ship new tiers multiple times a year. How a router handles a new model determines how much ongoing work it costs you.
A learned router predicts model identity, so a model it was not trained on is invisible to it. Not Diamond's docs are explicit: to route to a newly added model you retrain. You append new data rows (prompts, that model's responses, evaluation scores) and call the training function again with the same preference ID and override=True. Training runs from a couple of minutes up to an hour, and you cannot use the router until it completes. So every model release that matters to you triggers a data-collection and retraining cycle.
Morph Router predicts difficulty, not model identity. The classifier never sees a model name. When Anthropic ships a new tier, you change one line of config to map a difficulty level to the new model. The classifier is untouched, there is no data to collect, and there is no training wait.
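The one-line remap can be pictured as a plain difficulty-to-model table. This is a sketch that assumes the map lives in your own application config; where Morph actually stores it is not specified here. The replacement model ID is a made-up placeholder, not a real release.

```typescript
type Difficulty = "easy" | "medium" | "hard";

// Difficulty-to-model map; model IDs are the ones used in this article.
const tierMap: Record<Difficulty, string> = {
  easy: "claude-haiku-4-5",
  medium: "claude-sonnet-4-6",
  hard: "claude-opus-4-6",
};

// HYPOTHETICAL future model ID, used only to show the remap.
const updatedTierMap: Record<Difficulty, string> = {
  ...tierMap,
  hard: "claude-next-frontier-model",
};

// The classifier's output is only a key into the map, so nothing
// about classification changes when a model is swapped.
function modelFor(difficulty: Difficulty, map: Record<Difficulty, string>): string {
  return map[difficulty];
}

console.log(modelFor("hard", tierMap));
console.log(modelFor("hard", updatedTierMap));
```

Only the `hard` entry changes between the two maps; the easy and medium tiers, and the classification step itself, stay as they were.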
| Step | Morph Router | Not Diamond (custom router) |
|---|---|---|
| Collect labeled data | Not needed | Required (prompts + responses + scores) |
| Training run | None | Minutes to ~1 hour |
| Router usable during | Immediately | Blocked until training completes |
| Change to apply | One config remap | Append rows + retrain (override=True) |
For a market-wide router this retraining cost can be worth it: the router is squeezing the last few percent of cost-quality across dozens of models, and that requires per-model evidence. For a team that just wants routine prompts on Haiku and hard prompts on Opus, the retraining cycle is overhead that buys nothing.
Explainable Routing Decisions
When a router downgrades a prompt and the answer is worse than expected, you need to know why. Morph Router returns the difficulty classification alongside the model, so the decision is legible: this prompt was classified easy, therefore it went to Haiku. You can log the class, audit the distribution, set thresholds, and override specific cases.
A learned meta-router emits a model choice from a trained network. The choice reflects patterns in the training data, which is powerful but opaque per request. You see the output (use model X) without a first-class reason you can act on. For debugging a regression or explaining a routing decision to a teammate, the explicit difficulty class is easier to reason about.
Inspectable class
Every Morph routing decision carries easy / medium / hard / needs_info. Log it, chart the distribution per repo or per developer, and tune thresholds from real data.
needs_info instead of a guess
An ambiguous prompt like 'fix it' returns needs_info, so you ask the user instead of misrouting tokens. A model-identity router has no equivalent abstain signal.
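Because every decision carries a class, auditing the distribution is a small aggregation over your own logs. A minimal sketch, assuming you record one entry per routing decision; the `RoutingLogEntry` shape and the `repo` field are hypothetical.

```typescript
type Difficulty = "easy" | "medium" | "hard" | "needs_info";

interface RoutingLogEntry {
  repo: string; // HYPOTHETICAL field; log whatever dimensions you audit by
  difficulty: Difficulty;
}

// Count decisions per difficulty class and return each class's share.
function difficultyShares(log: RoutingLogEntry[]): Record<Difficulty, number> {
  const counts: Record<Difficulty, number> = { easy: 0, medium: 0, hard: 0, needs_info: 0 };
  for (const entry of log) counts[entry.difficulty] += 1;
  const total = log.length || 1; // avoid dividing by zero on an empty log
  return {
    easy: counts.easy / total,
    medium: counts.medium / total,
    hard: counts.hard / total,
    needs_info: counts.needs_info / total,
  };
}

const sampleLog: RoutingLogEntry[] = [
  { repo: "api", difficulty: "easy" },
  { repo: "api", difficulty: "easy" },
  { repo: "web", difficulty: "hard" },
  { repo: "web", difficulty: "needs_info" },
];

console.log(difficultyShares(sampleLog));
```

A rising `needs_info` share in one repo, for example, points at ambiguous prompts rather than at the router, which is the kind of signal a bare model choice does not give you.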
Pricing Models
The two products price the routing layer differently, and the difference matters for forecasting.
Morph Router charges a flat $0.001 per classification request, independent of prompt size or which model gets selected. 100 routing calls cost $0.10. One million cost $1,000. The cost is predictable per request and trivial against the model savings: routing one easy 4,000-token prompt from Opus to Haiku saves far more than $0.001.
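The 4,000-token claim can be checked with the prices quoted earlier. This sketch counts input tokens only, so it understates the saving (output tokens are cheaper on Haiku as well); the helper name is hypothetical.

```typescript
const ROUTING_FEE_USD = 0.001; // flat fee per classification, from the text
const OPUS_INPUT_PER_M = 5; // USD per million input tokens
const HAIKU_INPUT_PER_M = 1; // USD per million input tokens

// Input-token cost of one prompt at a given per-million price.
function inputCost(tokens: number, pricePerM: number): number {
  return (tokens / 1e6) * pricePerM;
}

const promptTokens = 4000;
const saved =
  inputCost(promptTokens, OPUS_INPUT_PER_M) - inputCost(promptTokens, HAIKU_INPUT_PER_M);
const net = saved - ROUTING_FEE_USD;

console.log(`saved $${saved.toFixed(4)}, net of routing fee $${net.toFixed(4)}`);
```

On input tokens alone the downgrade saves about $0.016, or $0.015 after the routing fee, so the fee is recovered many times over on a single routed prompt.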
Not Diamond does not publish a per-request or per-token rate. Its pricing page lists an Early Access plan (free to try) and an Enterprise plan (custom pricing), and states it charges a small fixed fee per million tokens that is cheaper than the cheapest LLM. That is a token-metered model, so the routing cost scales with prompt and response size rather than being a flat per-decision fee. The exact rate is not disclosed publicly, which makes precise forecasting harder until you talk to their team.
| | Morph Router | Not Diamond |
|---|---|---|
| Unit | Per classification request | Per million tokens |
| Rate | $0.001 flat | Fixed fee per M tokens (not public) |
| Scales with token volume | No | Yes |
| Public price | Yes | No (sales-led) |
| Plans | Usage-based | Early Access (free) + Enterprise (custom) |
Neither is "cheaper" in the abstract. A flat per-request fee is easier to forecast and comes out cheaper when requests carry many tokens; a per-token fee can be cheaper on tiny prompts. The point for a buyer is that Morph's number is published and fixed, so you can model your routing cost before you integrate. See the full cost-optimization breakdown.
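The crossover between the two pricing shapes is a one-line calculation once you have a per-token quote. In this sketch the per-million-token rate is a placeholder you would replace with a real quote; it is not Not Diamond's price, which is not public.

```typescript
const FLAT_FEE_PER_REQUEST = 0.001; // USD, published rate from the text

// Tokens per request at which a per-token routing fee equals the flat fee.
// Below this size the per-token fee is cheaper; above it the flat fee is.
function breakEvenTokens(feePerMTokens: number): number {
  if (feePerMTokens <= 0) throw new Error("feePerMTokens must be positive");
  return (FLAT_FEE_PER_REQUEST / feePerMTokens) * 1e6;
}

// PLACEHOLDER rate for illustration only; substitute the quote you receive.
const placeholderFeePerM = 0.1;

console.log(`break-even at ~${Math.round(breakEvenTokens(placeholderFeePerM))} tokens per request`);
```

Comparing that threshold with your typical request size tells you which pricing shape favors your traffic.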
Which One to Pick
| Your situation | Better fit | Why |
|---|---|---|
| Standardized on Claude, want Haiku/Sonnet/Opus routing | Morph Router | Anthropic-only routing, no provider switch, $0.001/req |
| Want one router across the whole model market | Not Diamond | Learned router across 60+ models from many providers |
| No labeled eval data, want to start today | Morph Router | Difficulty classifier needs no training data |
| Have rich eval data, want max cost-quality arbitrage | Not Diamond | Custom router trained on your scored prompts |
| Frequent model releases, want zero retraining | Morph Router | Classifier is model-agnostic; remap config only |
| Need to explain/audit each routing decision | Morph Router | Returns explicit difficulty class per request |
| Want flat, published, forecastable routing cost | Morph Router | $0.001 per classification, public |
| Already using OpenRouter Auto | Not Diamond | It powers Auto mode; you may already be on it |
Migrating from a cross-provider router to Anthropic-only routing
```typescript
import { morph } from 'morph'
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic()

// askUserToClarify is your own helper (not shown here).
async function chat(userQuery: string, history: Anthropic.MessageParam[]) {
  // Classify difficulty, get a Claude model back. No training,
  // no retraining when Anthropic ships a new tier.
  const { model, difficulty } = await morph.routers.anthropic.selectModel({
    input: userQuery,
    mode: 'balanced' // 'aggressive' pushes more prompts to Haiku
  })

  // Ambiguous prompt: ask the user instead of spending tokens on a guess.
  if (difficulty === 'needs_info') return askUserToClarify(userQuery)

  return anthropic.messages.create({
    model, // claude-haiku-4-5 | claude-sonnet-4-6 | claude-opus-4-6
    max_tokens: 4096,
    messages: [...history, { role: 'user', content: userQuery }],
  })
}
```

For multi-agent systems, the routing decision also applies at the architecture level: assign the frontier model to planning turns and a cheaper model to execution turns. See multi-agent model routing for the planner-executor pattern and the cost math.
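The planner-executor split can be sketched as a fixed assignment by turn role, with no classifier involved. The `Turn` shape and the role names are hypothetical; the model IDs follow this article's examples.

```typescript
type TurnKind = "plan" | "execute";

// Model IDs follow this article's examples; swap in your own.
const MODEL_BY_TURN: Record<TurnKind, string> = {
  plan: "claude-opus-4-6", // frontier model for planning turns
  execute: "claude-haiku-4-5", // cheaper model for execution turns
};

interface Turn {
  kind: TurnKind;
  prompt: string;
}

// Pair each turn with the model its role calls for.
function assignModels(turns: Turn[]): Array<Turn & { model: string }> {
  return turns.map((turn) => ({ ...turn, model: MODEL_BY_TURN[turn.kind] }));
}

const session: Turn[] = [
  { kind: "plan", prompt: "Design the migration steps" },
  { kind: "execute", prompt: "Rename the column in schema.sql" },
];

for (const t of assignModels(session)) console.log(t.kind, "->", t.model);
```

Role-based assignment and per-prompt difficulty routing can be combined, for example by classifying only the execution turns.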
Frequently Asked Questions
What is the best Not Diamond alternative in 2026?
It depends on what you want from a router. If you are standardized on one provider (for example Claude) and want to route between its tiers automatically, Morph Router is the closest fit: it routes Anthropic-only between Haiku 4.5, Sonnet 4.5, and Opus, classifies prompts in ~430ms, needs no retraining when a new model ships, and costs $0.001 per classification. If you want a single router across 60+ models from many providers and can supply labeled training data, Not Diamond is built for that cross-provider case.
How does Not Diamond's router work?
Not Diamond trains a learned router, described in its docs as a meta-model. You supply prompts, candidate model responses, and evaluation scores, and it learns which model answers best per query type for the lowest cost. It also ships an out-of-the-box router across 60+ models and powers OpenRouter's Auto mode. To route to your own or newly added models, you train a custom router (minimum 15 samples, up to 10,000 samples or 5MB per training job).
Does Morph Router require retraining when a new model is released?
No. Morph Router classifies prompt difficulty (easy, medium, hard, needs_info), not model identity. When Anthropic ships a new Claude tier, you map a difficulty level to the new model name in config; the classifier is untouched. Not Diamond's learned router requires retraining to route to a newly added model: its docs say you append new data rows and call the training function again with override=True.
Can Morph Router route between Claude models only?
Yes. Morph's Anthropic router returns Anthropic model names only: Haiku 4.5 for easy prompts, Sonnet 4.5 for medium, Opus for hard. Teams on Claude cut cost without leaving the Anthropic family, changing SDKs, or sending prompts through another provider's models.
How much does Not Diamond cost?
Not Diamond does not publish a per-request or per-token price. Its pricing page lists an Early Access plan (free to try) and an Enterprise plan (custom pricing), and states it charges a small fixed fee per million tokens that is cheaper than the cheapest LLM. The exact rate is not disclosed publicly. Morph Router is a flat $0.001 per classification request.
Is Morph Router's routing decision explainable?
Yes. Morph Router returns an explicit difficulty classification (easy, medium, hard, or needs_info) alongside the recommended model. You can log it, audit the distribution, and override it. A learned meta-router emits a model choice from a trained network, which is harder to inspect per request.
Does either router proxy my LLM request?
Morph Router does not proxy the LLM call. It classifies the prompt and returns a model recommendation; your application makes the actual request with your own provider SDK and keys. Not Diamond integrates through its API and works with your model gateway and harness of choice, returning a routing decision you act on.
How much can model routing save on a Claude bill?
On coding-agent workloads, 60 to 80 percent of requests are routine. Routing those to Haiku 4.5 ($1/$5 per M) instead of Opus ($5/$25 per M) or Sonnet ($3/$15 per M), while keeping hard prompts on the frontier model, saves 40 to 70 percent depending on your prompt distribution, with under 2 percent quality loss on hard tasks.
Route Your Claude Bill, Without Leaving Claude
Morph Router classifies prompt difficulty in ~430ms and routes Anthropic-only between Haiku 4.5, Sonnet 4.5, and Opus. No training data, no retraining when a new model ships, an explainable difficulty class on every decision, and a flat $0.001 per classification. 40-70% savings on routine coding traffic.
