Two tools that both watch your LLM calls, built on opposite premises: an async SDK you instrument by hand versus a one-line proxy that sits in the request path. This comparison decides it on the numbers that move a purchase: every free tier limit, the real cost by seats and volume, the latency tradeoff, and the failures neither tool catches. Pricing verified against each vendor's published page as of June 2026.
TL;DR
Pick Helicone for one-line setup with no code changes, Apache-2.0 open source, free self-hosting, and flat $79/mo pricing with unlimited seats across 100+ models. Pick LangSmith if you are committed to LangChain or LangGraph, want first-party tracing with no request-path hop, and your seat count and trace volume stay modest. Helicone trades one network hop for zero setup; LangSmith trades instrumentation work for zero hop. The seat math favors Helicone once your team passes two people ($117 for three LangSmith seats versus $79 flat); the framework integration favors LangSmith if you never leave LangChain.
Quick Comparison
| Dimension | LangSmith | Helicone |
|---|---|---|
| License | Closed source | Apache-2.0 |
| Integration | SDK, async background flush (no hop) | Base-URL swap to gateway (one hop) |
| Free tier | 5k base traces/mo, 1 seat | 10k requests/mo, 1 GB, 1 seat, 7-day retention |
| First paid tier | Plus $39/seat/mo, 10k base traces | Pro $79/mo, unlimited seats, alerts, HQL |
| Overage | $2.50 per 1k base traces (14-day), $5 per 1k extended (400-day) | Included in Pro flat rate |
| Self-hosting | Enterprise plan only, custom pricing | Free: 5-service docker compose |
| Model coverage | First-party LangChain / LangGraph | Proxy in front of 100+ models |
| Trace store | Managed (closed) | ClickHouse |
| Higher tier | Enterprise / quote | Team $799/mo, 5 orgs, SOC-2 + HIPAA, 3-month retention |
Pricing and Free Tiers
The two price on different axes. LangSmith charges per seat and meters base traces: Plus is $39 per seat per month including 10k base traces, then overage runs $2.50 per 1k base traces at 14-day retention, or $5 per 1k for extended 400-day retention. Helicone charges a flat rate with unlimited seats: Pro is $79 per month for unlimited users, alerts, and HQL (its query language), with Team at $799 per month adding five orgs, SOC-2 and HIPAA, and 3-month retention.
The free tiers cover small projects on both sides: LangSmith Developer gives 5k base traces a month for one seat; Helicone gives 10k requests a month, 1 GB, one seat, and 7-day retention. LangSmith self-hosting is Enterprise-only with custom pricing; Helicone self-hosting is free because the project is Apache-2.0.
Where the seat math flips
A solo project is cheap on either tool. A two-person team is roughly a wash: $78 for two LangSmith seats versus $79 flat on Helicone, before any trace overage. From three people on, Helicone is cheaper on seats alone: $117 versus $79. A ten-person team is $79 on Helicone versus $390 in LangSmith seats alone. Seats are unlimited on Helicone's flat rate and billed per head on LangSmith, so the gap widens with headcount, not just volume.
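The seat arithmetic can be checked with a small cost model. This is a minimal sketch using only the prices quoted in this article; the function and constant names are illustrative, and it assumes the 10k included base traces are pooled once per account rather than granted per seat, which you should confirm against LangSmith's pricing page.

```python
# Prices as quoted in this article (June 2026); verify against the vendor pages.
LANGSMITH_SEAT_USD = 39.0            # Plus, per seat per month
LANGSMITH_INCLUDED_TRACES = 10_000   # base traces included (ASSUMED pooled per account)
LANGSMITH_OVERAGE_PER_1K = 2.50      # base traces at 14-day retention
HELICONE_PRO_USD = 79.0              # flat rate, unlimited seats


def langsmith_plus_monthly(seats: int, base_traces: int) -> float:
    """Estimated LangSmith Plus bill: seat fees plus base-trace overage."""
    overage_traces = max(0, base_traces - LANGSMITH_INCLUDED_TRACES)
    return seats * LANGSMITH_SEAT_USD + overage_traces / 1000 * LANGSMITH_OVERAGE_PER_1K


def helicone_pro_monthly(seats: int) -> float:
    """Helicone Pro is flat, so the seat count does not change the bill."""
    return HELICONE_PRO_USD


# Seats-only comparison (no trace overage) for a few team sizes.
for seats in (1, 2, 3, 10):
    print(seats, langsmith_plus_monthly(seats, 0), helicone_pro_monthly(seats))
```

At zero overage the function returns 78 for two seats and 390 for ten. With one seat and 1M base traces it returns 2514.0, which matches the $2,514/mo LangSmith figure quoted under Langfuse vs LangSmith in the related comparisons.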
Proxy vs Async SDK: The Latency Tradeoff
This is the architectural fork, and it is worth stating precisely instead of as a slogan. Helicone's gateway mode is a proxy: you change the base URL of your LLM client so requests route through Helicone, which means one network hop on the critical path per request, in exchange for zero code changes. LangSmith's SDK is async: it queues trace events in memory and flushes them in the background, so there is no hop on the request path, in exchange for writing instrumentation into your code.
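The queue-and-flush pattern described above can be illustrated with a toy tracer. This is a generic sketch of the pattern, not LangSmith's SDK code: `BackgroundTracer`, `slow_export`, and the flush interval are all hypothetical names and values chosen for the example.

```python
import queue
import threading
import time


class BackgroundTracer:
    """Toy tracer: record() only enqueues; a worker thread does the slow export."""

    def __init__(self, export, flush_interval_s=0.05):
        self._export = export            # hypothetical callable that ships a batch
        self._events = queue.Queue()
        self._interval = flush_interval_s
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, event):
        # Called on the request path: an in-memory put, no network I/O.
        self._events.put(event)

    def _drain(self):
        # Pull everything currently queued and export it as one batch.
        batch = []
        while True:
            try:
                batch.append(self._events.get_nowait())
            except queue.Empty:
                break
        if batch:
            self._export(batch)

    def _run(self):
        # Wake up every interval until close() sets the stop flag.
        while not self._stop.wait(self._interval):
            self._drain()

    def close(self):
        # Stop the worker, then flush whatever is still queued.
        self._stop.set()
        self._worker.join()
        self._drain()


exported = []


def slow_export(batch):
    time.sleep(0.02)  # stand-in for a network call to a trace backend
    exported.extend(batch)


tracer = BackgroundTracer(slow_export)
for i in range(100):
    tracer.record({"i": i})  # returns immediately; export happens off-thread
tracer.close()
print(len(exported))
```

The cost of this design is the one the text names: every call site has to invoke something like `record()`, and events still in the queue are lost if the process dies before a flush.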
Whether the hop matters depends on what it is measured against. A single LLM generation runs for seconds; one extra network hop against a multi-second response is noise. Inside a sub-second pipeline, such as an embedding call or a routing classifier, that same hop is a real fraction of the budget. The honest rule: the proxy hop is free when the thing it wraps is slow and expensive when the thing it wraps is fast.
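To put rough numbers on that rule, the hop can be expressed as a share of end-to-end latency. The 30 ms hop and both call durations below are hypothetical placeholders, not measurements of Helicone's gateway; substitute figures from your own traces.

```python
def hop_share(hop_ms: float, call_ms: float) -> float:
    """Fraction of end-to-end latency attributable to the proxy hop."""
    return hop_ms / (hop_ms + call_ms)


ASSUMED_HOP_MS = 30.0  # hypothetical; measure from your own region and network

workloads = [
    ("multi-second generation", 3000.0),  # assumed duration
    ("embedding call", 120.0),            # assumed duration
]

for name, call_ms in workloads:
    print(f"{name}: {hop_share(ASSUMED_HOP_MS, call_ms):.1%}")
```

Under these assumed numbers the hop is about 1% of a three-second generation and 20% of a 120 ms embedding call, which is the asymmetry the rule describes.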
The fork is not absolute. Helicone also ships an async OpenLLMetry logging mode: you keep the dashboard but log out of band instead of proxying, which gives you the same no-hop tradeoff LangSmith's SDK makes, with the open-source license and 100+ model coverage on top. If the gateway hop is the only thing keeping you off Helicone, that mode removes it.
Open Source vs Closed
Helicone is Apache-2.0, so self-hosting is free: a five-service docker compose runs the Next.js web app, the Jawn log collector, Supabase, ClickHouse, and MinIO on your own infrastructure. That stack is not weightless, but it is yours, and there is no license cost. The project migrated its analytics store from Postgres to ClickHouse, which cut query times from over 100 seconds to about 0.5 seconds, and the repo sits at roughly 5,800 GitHub stars.
LangSmith is closed source. Self-hosting it inside your own VPC is an Enterprise-plan capability with custom annual pricing, not something you spin up from a public repo. If running the tool on your own infrastructure at no license cost is a hard requirement, that requirement alone decides it for Helicone.
When LangSmith Wins
- You are all-in on LangChain or LangGraph and want first-party tracing with no glue code.
- You cannot accept a request-path hop, so the async SDK's background flush matters.
- You want Prompt Hub, annotation queues, and native LangGraph spans as part of the product, not assembled.
- Your seat count and trace volume stay modest, so per-seat and per-trace pricing never compounds.
When Helicone Wins
- You want one-line setup with no code changes: swap the base URL and you are tracing.
- You call many providers and want one proxy in front of 100+ models, not a framework-bound SDK.
- You want open source (Apache-2.0) and the option to self-host for free.
- Your team is more than a couple of people, where flat $79/mo with unlimited seats beats $39 per head.
The Third Option: Own the Stack
There is a path neither vendor advertises, and a growing number of teams take it: instrument with OpenTelemetry, store the spans in your own ClickHouse, and skip the per-seat and per-trace meters entirely. It is not exotic. Helicone itself runs on ClickHouse. The instrumentation layer (OpenLLMetry, Apache-2.0, OpenTelemetry-based, roughly 7,200 GitHub stars, free) and the dashboard (Grafana OSS, free) cost nothing; a small ClickHouse Cloud Basic starts around $66/mo, or self-hosting ClickHouse (Apache-2.0) is free. OpenLLMetry's Show HN laid out the OpenTelemetry-native pitch, and in an r/LangChain thread from this month the top reply was blunt: "instrument everything via native OpenTelemetry so you can swap backends." Teams also tend to drift toward their existing stack rather than adopt a new dashboard. We walk through the whole build in build your own LLM observability.
What Both Miss: Semantic Signals
Everything above measures the mechanics of a call. None of it measures the meaning. A response that quotes the wrong refund policy returns a 200 with normal latency and a normal token count. A user who is quietly getting angry produces the same log line as a delighted one. An agent stuck in a three-step loop looks like an agent doing work. The trace is green and the product is broken.
These failures are semantic, so the fix is a label on the content of each turn: is_user_frustrated, stuck-in-a-loop, leaked-thinking, jailbreak, or a signal specific to your product. Both LangSmith and Helicone approximate this with LLM-as-judge evals, which run offline on samples. A Morph Reflex is a classifier that returns the label inline, in under 90 milliseconds, cheap enough to run on every turn rather than a sample; you then write the label back onto the LangSmith trace or the Helicone request as an attribute. The base model is morph-reflex-v1, built-in signals include jailbreak, guardrail, leaked-thinking, stuck-in-a-loop, incomplete-thought, ambiguity, difficulty, and domain, and you can train a custom signal in under an hour.
Score a turn, then attach it to your trace
curl -X POST "https://api.morphllm.com/v1/reflex/predict" \
-H "Authorization: Bearer $MORPH_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model": "stuck-in-a-loop", "text": "<the agent turn>"}'
# {
# "model": "stuck-in-a-loop",
# "mode": "single_label",
# "classes": [
# { "class_id": 0, "label": "progressing", "score": 0.04, "selected": false },
# { "class_id": 1, "label": "looping", "score": 0.96, "selected": true }
# ],
# "inference_time_ms": 88
# }
The predicted label is the class with selected: true; there is no top-level label or confidence field, so you read the winner off the classes array. The result comes back as an API response, not a dashboard panel, so it composes with whichever tool you picked: write it onto the LangSmith span, attach it to the Helicone request, alert on it in Slack, or route on it inline. It complements a tracing platform; it does not replace one.
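Reading the winner off the classes array might look like the sketch below. It parses the sample response shown above and makes no network call; the helper names and the `reflex.<model>.*` attribute keys are hypothetical conventions for this example, not names defined by Morph, LangSmith, or Helicone.

```python
import json

# Sample response copied from the example above.
RESPONSE_JSON = """
{
  "model": "stuck-in-a-loop",
  "mode": "single_label",
  "classes": [
    {"class_id": 0, "label": "progressing", "score": 0.04, "selected": false},
    {"class_id": 1, "label": "looping", "score": 0.96, "selected": true}
  ],
  "inference_time_ms": 88
}
"""


def selected_class(response: dict) -> dict:
    """Return the single class entry flagged selected: true."""
    winners = [c for c in response["classes"] if c.get("selected")]
    if len(winners) != 1:
        raise ValueError(f"expected exactly one selected class, got {len(winners)}")
    return winners[0]


def as_trace_attributes(response: dict) -> dict:
    """Flatten the result into key/value pairs; the key names are hypothetical."""
    winner = selected_class(response)
    prefix = f"reflex.{response['model']}"
    return {
        f"{prefix}.label": winner["label"],
        f"{prefix}.score": winner["score"],
    }


result = json.loads(RESPONSE_JSON)
attributes = as_trace_attributes(result)
print(attributes)
```

A flat dictionary like this is easy to hand to whatever attribute or metadata mechanism your tracing tool exposes; how you attach it depends on the platform and is left out here.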
Frequently Asked Questions
LangSmith vs Helicone: what is the core difference?
Integration model. LangSmith is an async SDK you wire into your code with no request-path hop; Helicone is a one-line gateway in front of 100+ models that adds one hop. See the latency tradeoff.
Is Helicone open source and LangSmith not?
Yes. Helicone is Apache-2.0 with free self-hosting (a 5-service docker compose); LangSmith is closed source with Enterprise-only self-hosting. See open source vs closed.
LangSmith vs Helicone: which is cheaper?
It turns on seats. LangSmith Plus is $39 per seat; Helicone Pro is $79 flat with unlimited seats. At two people the seat cost is about even ($78 versus $79); from three people on Helicone is cheaper, and the gap grows with headcount. See pricing and free tiers.
Do LangSmith or Helicone catch wrong answers and frustrated users?
No. Both record structure (prompts, responses, latency, spans); neither labels meaning. Catching wrong answers, frustration, or looping needs a per-turn classifier on top, covered in semantic signals.
Related comparisons
LangSmith Alternatives
Seven alternatives by use case: Langfuse, Helicone, Phoenix, Braintrust, Weave, plus the OpenTelemetry + ClickHouse DIY route.
Langfuse Alternatives
When MIT-core Langfuse isn't the fit: LangSmith, Helicone, Phoenix, Braintrust, and self-hosting on your own ClickHouse.
Langfuse vs LangSmith
MIT-core and self-hostable vs first-party LangChain. Full pricing math: 1M traces costs $101/mo on Langfuse, $2,514/mo on LangSmith.
Langfuse vs Helicone
Two open-source paths: Langfuse's SDK + ClickHouse stack vs Helicone's drop-in gateway. Both run on ClickHouse; you can too.
Braintrust vs LangSmith
Eval-first scoring vs trace-first monitoring. Where per-score billing beats per-trace, and where it doesn't.
Build Your Own LLM Observability
OpenTelemetry + Traceloop + ClickHouse, the stack the vendors run. Own your traces from ~$66/mo and run Reflexes for the signals traces miss.
Add the layer the trace cannot see
Whichever platform you pick, Reflexes returns a semantic label on every turn in under 90 milliseconds, over an API that composes with your traces.