Langfuse vs LangSmith (2026): Pricing Math, Self-Host, and Lock-In Settled

The most searched head-to-head in LLM observability, decided on the numbers that move a purchase: every free tier limit, the cost at 1M events a month, the real self-host footprint of each, and the failures neither tool catches. Pricing verified against each vendor's published page as of June 2026.

$101/mo: Langfuse Core at 1M events

$2,514/mo: LangSmith Plus at 1M base traces

MIT vs closed: Langfuse core repo vs LangSmith

TL;DR

Pick Langfuse for open source, free self-hosting, unlimited seats on a $29/mo plan, and far lower cost past 100k events a month. Pick LangSmith if you are committed to LangChain or LangGraph and want first-party tracing, Prompt Hub, and annotation queues with zero assembly, and your volume stays modest. The cost curve favors Langfuse the moment you scale; the integration curve favors LangSmith if you never leave its framework.

Quick Comparison

Langfuse vs LangSmith (June 2026)

Dimension	Langfuse	LangSmith
License	MIT core (ee/ under enterprise license)	Closed source
Free tier	50k units/mo, 2 users, 30-day access	5k base traces/mo, 1 seat
First paid tier	Core $29/mo, 100k units, unlimited users	Plus $39/seat/mo, 10k base traces
Overage	$8 per 100k units (down to $6 at 50M+)	$2.50 per 1k base traces (14-day), $5 per 1k extended (400-day)
Self-hosting	Free: docker compose, Helm, or Terraform	Enterprise plan only, custom pricing
Trace store	ClickHouse (+ Postgres, Redis, S3)	Managed (closed)
Ecosystem fit	Framework-agnostic	First-party LangChain / LangGraph
Retention	30d Hobby, 90d Core, 3yr Pro ($199/mo)	14d base, 400d extended

Pricing and the Cost at Volume

The two tools meter differently, which is where the gap opens. Langfuse bills units (any ingested event: a trace, an observation, or a score). LangSmith bills base traces. The quantities are not 1:1, so the honest comparison fixes a monthly event count and runs each pricing model against it.

At 1M events per month, Langfuse Core is $29 plus nine increments of $8 per 100k = $101/mo, with unlimited users. LangSmith Plus is $39 for the seat plus 990k traces of overage at $2.50 per 1k = $2,514/mo, for one seat, at 14-day retention. The caveat that keeps this honest: a Langfuse unit is one ingested event, so a single agent trace with multiple observations and scores burns multiple units. Even at five units per trace (5M units, or $421/mo at the flat $8 rate), Langfuse costs about a sixth of the LangSmith bill, so the gap narrows but does not close.
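The arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical, and two simplifications are assumptions rather than vendor rules: Langfuse overage is rounded up to whole 100k increments at a flat $8 (the table lists lower rates at very high volume, so this overstates the Langfuse bill there), and the 10k included LangSmith traces are counted once rather than per seat.

```python
import math


def langfuse_core_cost(units: int) -> int:
    """Langfuse Core: $29 base, 100k units included, $8 per extra 100k units.

    Assumption: overage is billed in whole 100k increments at a flat $8.
    """
    extra_units = max(0, units - 100_000)
    return 29 + 8 * math.ceil(extra_units / 100_000)


def langsmith_plus_cost(traces: int, seats: int = 1) -> float:
    """LangSmith Plus: $39 per seat, 10k base traces included,
    $2.50 per extra 1k base traces (14-day retention).

    Assumption: the 10k included traces are counted once, not per seat.
    """
    extra_traces = max(0, traces - 10_000)
    return 39 * seats + 2.50 * extra_traces / 1_000


# The headline comparison: 1M events, one LangSmith seat.
print(langfuse_core_cost(1_000_000))    # 101
print(langsmith_plus_cost(1_000_000))   # 2514.0

# The unit-vs-trace caveat: 1M traces that each burn 5 Langfuse units.
print(langfuse_core_cost(5 * 1_000_000))  # 421
```

Changing the units-per-trace multiplier is the quickest way to see how sensitive the comparison is to your own trace shape.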

The Reddit consensus

The cost cliff is the most cited reason teams move. In an r/LangChain thread that opened when LangSmith stopped being free, the top reply was simply "We self-host Langfuse and are pretty happy so far," with another commenter calling free self-hosting "a key requirement." A direct Langfuse-vs-LangSmith thread lands the same way: "opensource and built using otel so less risk of vendor lock in with langfuse."

Self-Hosting Footprint

Free self-hosting is Langfuse's structural advantage, but it is not weightless. The v3 architecture splits transactional data (Postgres), analytics (ClickHouse), queues and cache (Redis/Valkey), and event payloads (S3-compatible storage) into separate services, run as web and worker containers. That separation is why it scales, and also why it is heavier than a single-binary tool. If you cannot run ClickHouse, that one constraint rules out the self-host path.

What each takes to run yourself

	Langfuse	LangSmith
Stack	Web + worker, Postgres, ClickHouse, Redis/Valkey, S3	Runs in your VPC (managed image)
Deploy	docker compose, Helm, or AWS/Azure/GCP Terraform	Enterprise plan only
License cost	$0 (MIT core)	Custom annual pricing

Worth knowing: ClickHouse acquired Langfuse in January 2026, so the analytics engine under Langfuse and the company shipping it are now one and the same.

Lock-In and OpenTelemetry

The lock-in question is really an instrumentation question: if you rip the tool out in a year, do you re-instrument your codebase? Langfuse ingests OpenTelemetry spans, so a team can instrument once with OTel and keep Langfuse as a swappable backend. LangSmith is closed and first-party; its deepest value (Prompt Hub, native LangGraph tracing) is also what binds you to the LangChain stack. Neither is wrong, but they pull in opposite directions: Langfuse toward portability, LangSmith toward integration.
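To make the "swappable backend" point concrete, here is a minimal sketch of what changes when an OTel-instrumented app switches destinations: only exporter configuration. `OTEL_EXPORTER_OTLP_ENDPOINT` and `OTEL_EXPORTER_OTLP_HEADERS` are standard OpenTelemetry environment variables. The Langfuse endpoint path and Basic-auth scheme shown are assumptions to verify against the Langfuse OpenTelemetry docs, and the helper names and keys are hypothetical placeholders.

```python
import base64


def basic_auth_header(public_key: str, secret_key: str) -> str:
    """Build an OTLP header string carrying HTTP Basic auth.

    Assumption: the backend accepts base64("public:secret") Basic auth.
    """
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return f"Authorization=Basic {token}"


def otlp_env(endpoint: str, headers: str = "") -> dict[str, str]:
    """Return the environment variables an OTLP exporter reads."""
    env = {"OTEL_EXPORTER_OTLP_ENDPOINT": endpoint}
    if headers:
        env["OTEL_EXPORTER_OTLP_HEADERS"] = headers
    return env


# Backend A: Langfuse Cloud (endpoint path is an assumption; keys are fake).
langfuse = otlp_env(
    "https://cloud.langfuse.com/api/public/otel",
    basic_auth_header("pk-lf-example", "sk-lf-example"),
)

# Backend B: a collector you run yourself on the default OTLP/HTTP port.
own_collector = otlp_env("http://localhost:4318")

print(langfuse["OTEL_EXPORTER_OTLP_ENDPOINT"])
print(own_collector["OTEL_EXPORTER_OTLP_ENDPOINT"])
```

The application code that creates spans is identical in both cases, which is the portability argument in practice.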

When LangSmith Wins

You are all-in on LangChain or LangGraph and want first-party tracing with no glue code.
You want Prompt Hub and annotation queues as part of the product, not assembled.
Your trace volume stays modest, so per-trace overage never compounds into the four-figure range.

When Langfuse Wins

You want open source (MIT core) and the option to self-host for free.
You need unlimited seats without paying per head ($29/mo Core).
Your volume is past ~100k events a month, where the $8-per-100k curve crushes $2.50-per-1k.
You run more than one framework, or none, and want framework-agnostic tracing.

The Third Option: Own the Stack

There is a path neither vendor advertises, and a growing number of teams take it: instrument with OpenTelemetry, store the spans in your own ClickHouse, and skip the per-trace meter entirely. It is not exotic. Langfuse runs on ClickHouse, Helicone migrated to ClickHouse, and SigNoz is built on it. The instrumentation layer (OpenLLMetry, Apache-2.0) and the dashboard (Grafana OSS) are free; a small ClickHouse Cloud service starts around $66/mo. In an r/LangChain thread from this month, the top reply was blunt: "instrument everything via native OpenTelemetry so you can swap backends when you inevitably get frustrated." We walk through the whole build in "Build Your Own LLM Observability."
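As a rough way to reason about when a flat $66/mo store beats a meter, the sketch below finds the first monthly volume at which each metered plan costs more than a fixed bill. It reuses the list prices from the comparison table with the same assumptions (flat $8 per 100k Langfuse overage, one LangSmith seat), and it deliberately ignores engineering time and any ClickHouse usage beyond the entry price, so treat the result as a lower bound on the real break-even. All function names are hypothetical.

```python
import math


def langfuse_core(units: int) -> float:
    # $29 base, 100k units included, $8 per extra 100k (flat-rate assumption).
    return 29 + 8 * math.ceil(max(0, units - 100_000) / 100_000)


def langsmith_plus(traces: int) -> float:
    # $39 for one seat, 10k base traces included, $2.50 per extra 1k.
    return 39 + 2.50 * max(0, traces - 10_000) / 1_000


def break_even(cost_fn, fixed_monthly: float, step: int = 1_000,
               limit: int = 10_000_000):
    """Smallest volume (a multiple of `step`) where the metered plan
    costs more than `fixed_monthly`; None if not reached by `limit`."""
    for volume in range(0, limit + step, step):
        if cost_fn(volume) > fixed_monthly:
            return volume
    return None


print(break_even(langsmith_plus, 66))  # 21000
print(break_even(langfuse_core, 66))   # 501000
```

Under these assumptions the fixed bill undercuts LangSmith Plus almost immediately, while Langfuse Core stays cheaper until roughly half a million units a month.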

What Both Miss: Semantic Signals

Everything above measures the mechanics of a call. None of it measures the meaning. A response that quotes the wrong refund policy returns a 200 with normal latency and a normal token count. A user who is quietly getting angry produces the same span as a delighted one. An agent stuck in a three-step loop looks like an agent doing work. The trace is green and the product is broken.

These failures are semantic, so the fix is a label on the content of each turn: is_user_frustrated, stuck-in-a-loop, leaked-thinking, jailbreak, or a signal specific to your product. Both Langfuse and LangSmith approximate this with LLM-as-judge evals, which run offline on samples. A Morph Reflex is a classifier that returns the label inline, in under 90 milliseconds, cheap enough to run on every turn rather than a sample. The label can then be written back onto the Langfuse or LangSmith span as an attribute.

Score a turn, then attach it to your trace

curl -X POST "https://api.morphllm.com/v1/reflex/predict" \
  -H "Authorization: Bearer $MORPH_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "stuck-in-a-loop", "text": "<the agent turn>"}'

# {
#   "model": "stuck-in-a-loop",
#   "mode": "single_label",
#   "classes": [
#     { "class_id": 0, "label": "progressing", "score": 0.04, "selected": false },
#     { "class_id": 1, "label": "looping",     "score": 0.96, "selected": true }
#   ],
#   "inference_time_ms": 88
# }

The label comes back as an API response, not a dashboard panel, so it composes with whichever tool you picked: write it onto the span, alert on it in Slack, or route on it inline. It complements a tracing platform; it does not replace one.
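A minimal sketch of the "write it onto the span" step, using the sample response shown above: it picks the selected class and flattens it into key/value attributes that any tracing SDK could attach. The attribute names and the `reflex.` prefix are hypothetical, and the code assumes single_label mode marks exactly one class as selected.

```python
import json

# The sample response from the curl example above.
SAMPLE_RESPONSE = """
{
  "model": "stuck-in-a-loop",
  "mode": "single_label",
  "classes": [
    {"class_id": 0, "label": "progressing", "score": 0.04, "selected": false},
    {"class_id": 1, "label": "looping", "score": 0.96, "selected": true}
  ],
  "inference_time_ms": 88
}
"""


def selected_class(response: dict) -> dict:
    """Return the class entry flagged as selected.

    Assumption: single_label mode selects exactly one class.
    """
    for entry in response["classes"]:
        if entry["selected"]:
            return entry
    raise ValueError("response has no selected class")


def span_attributes(response: dict, prefix: str = "reflex") -> dict:
    """Flatten a response into span attributes (names are hypothetical)."""
    chosen = selected_class(response)
    key = f"{prefix}.{response['model']}"
    return {
        f"{key}.label": chosen["label"],
        f"{key}.score": chosen["score"],
        f"{key}.inference_time_ms": response["inference_time_ms"],
    }


attrs = span_attributes(json.loads(SAMPLE_RESPONSE))
print(attrs)
```

The resulting dictionary can be passed to whichever span or trace update call your chosen platform's SDK provides, or used directly to drive an alert.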

Frequently Asked Questions

Langfuse vs LangSmith: which is cheaper?

Langfuse, by a wide margin at volume. At 1M events a month Langfuse Core is about $101 with unlimited users; LangSmith Plus is about $2,514 for one seat. The pricing section shows the math and the unit-vs-trace caveat.

Is Langfuse open source and LangSmith not?

Yes. Langfuse's core repo is MIT (ee/ folders under a separate license); LangSmith is closed source with Enterprise-only self-hosting.

Should I pick LangSmith if I use LangChain?

Often yes, if your volume stays modest: it is first-party to LangChain and LangGraph. Past ~100k traces a month, or if you need self-hosting or open source, Langfuse wins. See "When Langfuse Wins."

Do Langfuse or LangSmith catch wrong answers and frustrated users?

No. Both record structure (prompts, responses, latency, spans); neither labels meaning. Catching wrong answers, frustration, or looping needs a per-turn classifier on top, covered in "What Both Miss: Semantic Signals."

Add the layer the trace cannot see

Whichever platform you pick, a Reflex returns a semantic label on every turn in under 90 milliseconds, over an API that composes with your traces.


