Agent Trace: The Open Spec for AI Code Attribution, Explained

Cursor's Agent Trace spec lets any tool record which lines were written by AI and which by humans. Backed by Cognition, Cloudflare, Vercel, Google Jules, Amp, and OpenCode. Here is how it works.

March 4, 2026

The Problem: git blame Doesn't Know About AI

Run git blame on any file in a modern codebase. Every line is attributed to a human committer. But SemiAnalysis estimates that AI-generated code already accounts for ~4% of all public GitHub commits, trending toward 20% by end of 2026. Claude Code alone writes 135,000 commits per day.

The attribution metadata is wrong. A commit authored by jane@company.com might contain 200 lines she wrote, 150 lines Claude generated, and 50 lines she edited after Claude suggested them. git blame treats all 400 as Jane's. This causes three concrete problems:

Debugging

When AI-generated code breaks, developers waste time tracing logic they didn't write. Without knowing a function was AI-generated, they can't check if the original prompt was flawed or if the model hallucinated.

Code Review

Reviewers today apply the same scrutiny to human and AI code, but AI-generated code often needs different review patterns: checking for hallucinated APIs, missing edge cases, and subtle context misunderstandings.

Compliance

Regulated industries need to know what percentage of a codebase was AI-generated. SOC 2 auditors, DORA compliance officers, and security teams have no standard way to answer this question today.

The Scale of the Problem

Cognition measured that coding agents spend 60% of their time searching for context. When an agent modifies code another agent wrote, it has no way to access the original conversation, reasoning, or intent. Agent Trace fixes this by linking every code range to the conversation that produced it.

What Agent Trace Records

Agent Trace is a data specification, not a product. It defines a JSON-based "trace record" that connects code ranges to the conversations and contributors behind them. The core hierarchy: a trace record contains files, files contain conversations, conversations contain line ranges.

Each trace record captures:

  • File paths and the specific line ranges that were modified
  • Contributor type for each range: human, AI, mixed, or unknown
  • Model identification following the models.dev convention (e.g., anthropic/claude-opus-4-5-20251101)
  • Conversation URLs linking to the full chat context that produced the code
  • Content hashes (Murmur3) enabling attribution tracking when code moves between files
  • VCS metadata tying the trace to a specific commit, supporting Git, Jujutsu, Mercurial, and Subversion
  • Vendor-namespaced metadata for tool-specific data (e.g., dev.cursor.workspace_id)

Storage is implementation-defined. The spec does not mandate files, git notes, or databases. Implementations choose where traces live. The MIME type is application/vnd.agent-trace.record+json.

Trace Record Schema

A minimal valid trace record:

Minimal Agent Trace Record

{
  "version": "0.1.0",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-25T10:00:00Z",
  "files": [
    {
      "path": "src/app.ts",
      "conversations": [
        {
          "contributor": { "type": "ai" },
          "ranges": [{ "start_line": 1, "end_line": 50 }]
        }
      ]
    }
  ]
}
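A trace record is plain JSON, so any tool can emit one with its language's standard library. Here is a minimal sketch in Python that builds the record above; the field names come from the spec example, but the helper function name is our own:

```python
import json
import uuid
from datetime import datetime, timezone

def minimal_trace_record(path: str, start_line: int, end_line: int) -> str:
    """Build a minimal Agent Trace record for one AI-written range:
    one file, one conversation, one line range."""
    record = {
        "version": "0.1.0",
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "files": [
            {
                "path": path,
                "conversations": [
                    {
                        "contributor": {"type": "ai"},
                        "ranges": [{"start_line": start_line, "end_line": end_line}],
                    }
                ],
            }
        ],
    }
    return json.dumps(record, indent=2)

print(minimal_trace_record("src/app.ts", 1, 50))
```

Because storage is implementation-defined, the string could be written to a sidecar file, attached as a git note, or POSTed to a service; the spec only constrains the payload.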

A production trace record with full attribution:

Full Agent Trace Record with VCS, Model ID, and Content Hashing

{
  "version": "0.1.0",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "timestamp": "2026-01-23T14:30:00Z",
  "vcs": {
    "type": "git",
    "revision": "a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0"
  },
  "tool": {
    "name": "cursor",
    "version": "2.4.0"
  },
  "files": [
    {
      "path": "src/utils/parser.ts",
      "conversations": [
        {
          "url": "https://api.cursor.com/v1/conversations/12345",
          "contributor": {
            "type": "ai",
            "model_id": "anthropic/claude-opus-4-5-20251101"
          },
          "ranges": [
            {
              "start_line": 42,
              "end_line": 67,
              "content_hash": "murmur3:9f2e8a1b"
            }
          ],
          "related": [
            {
              "type": "session",
              "url": "https://api.cursor.com/v1/sessions/67890"
            }
          ]
        }
      ]
    }
  ],
  "metadata": {
    "confidence": 0.95,
    "dev.cursor": {
      "workspace_id": "ws-abc123"
    }
  }
}

Key Design Decisions

  • Line numbers are 1-indexed and refer to positions at the recorded revision, not current positions. Consumers use VCS blame to map to current state.
  • Content hashes are optional but enable position-independent attribution. When code moves to a different file or line range, the hash still identifies it.
  • Ranges group by conversation, reducing cardinality. One conversation producing edits across 10 line ranges stores the contributor metadata once.
  • Vendor metadata uses reverse-domain notation (dev.cursor, com.github.copilot) to prevent key collisions without a central registry.

The Four Contributor Types

Agent Trace classifies every attributed code range into one of four types:

human

Code authored directly by a developer. The traditional case. No AI involvement in generation.

ai

Code generated by an AI system. The contributor object includes an optional model_id field identifying the specific model (e.g., anthropic/claude-opus-4-5-20251101).

mixed

Human-edited AI output, or AI-edited human code. The most common case in practice: a developer accepts an AI suggestion and then tweaks variable names or adds error handling.

unknown

Origin cannot be determined. Useful for retroactive attribution of existing codebases where the generation method wasn't tracked.

The mixed type is where Agent Trace diverges most from a binary AI-or-human model. In real workflows, most code is collaborative. A developer prompts Claude, accepts 80% of the output, rewrites a function signature, and adds a null check. That is mixed, and tracking it accurately matters for understanding how teams actually use AI tools.
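The spec deliberately leaves the ai-versus-mixed threshold undefined (it is one of the open questions below). One plausible heuristic, which is ours and not part of the spec: compare the accepted text against the model's original suggestion and classify by similarity. A sketch using Python's difflib, with an arbitrary 0.95 cutoff:

```python
from difflib import SequenceMatcher

# Arbitrary cutoff of our own choosing; the v0.1.0 spec
# deliberately does not define this threshold.
MIXED_THRESHOLD = 0.95

def classify_contributor(ai_output: str, final_text: str) -> str:
    """Classify a range as 'ai' or 'mixed' by how much the developer
    changed the model's suggestion before committing it."""
    similarity = SequenceMatcher(None, ai_output, final_text).ratio()
    return "ai" if similarity >= MIXED_THRESHOLD else "mixed"

suggestion = "function add(a, b) {\n  return a + b;\n}"
print(classify_contributor(suggestion, suggestion))  # ai

edited = "function addNumbers(x, y) {\n  return x + y;\n}"
print(classify_contributor(suggestion, edited))      # mixed
```

Any fixed cutoff will misclassify edge cases (the one-character edit from the open questions section), which is exactly why the RFC wants production feedback before standardizing a rule.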

Context Graphs: Why Cognition Thinks This Is Bigger Than Attribution

Cognition (the team behind Devin) published a companion blog post arguing that Agent Trace is the first building block for "context graphs" in code. Their argument: modern development has shifted from bandwidth constraints (the problem Git solved in 2005) to context constraints. The bottleneck is not moving code between machines. It is preserving the reasoning behind code changes.

A context graph is a living record of decision traces stitched across entities and time, making precedent searchable. Foundation Capital calls this "AI's trillion-dollar opportunity," predicting the next major platforms will be systems of record for decisions, not systems of record for data.

Cognition reports two concrete performance gains from exposing prior reasoning to models:

  • +3 pts: SWE-Bench improvement when models access prior conversation context
  • 40-80%: cache hit rate improvement from structured context traces

The practical implication: when Agent A modifies code that Agent B wrote, Agent A can follow the conversation URL in the trace record to understand why those lines exist. Without this, Agent A treats the code as context-free text and is more likely to introduce regressions. With it, Agent A has the original prompt, the reasoning, and the constraints that shaped the code.
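That lookup is just a walk over the trace record. A sketch (record shape taken from the spec examples above; the function name is ours) answering "which conversation produced line N of this file?":

```python
def conversation_for_line(trace: dict, path: str, line: int):
    """Return the conversation URL covering `line` of `path` in a
    trace record, or None if no attributed range covers it."""
    for file_entry in trace.get("files", []):
        if file_entry["path"] != path:
            continue
        for conv in file_entry.get("conversations", []):
            for rng in conv.get("ranges", []):
                if rng["start_line"] <= line <= rng["end_line"]:
                    return conv.get("url")
    return None

trace = {
    "files": [{
        "path": "src/utils/parser.ts",
        "conversations": [{
            "url": "https://api.cursor.com/v1/conversations/12345",
            "contributor": {"type": "ai"},
            "ranges": [{"start_line": 42, "end_line": 67}],
        }],
    }]
}
print(conversation_for_line(trace, "src/utils/parser.ts", 50))  # the conversation URL
print(conversation_for_line(trace, "src/utils/parser.ts", 99))  # None
```

An agent would fetch the returned URL before editing, pulling the original prompt and constraints into its own context window.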

From Attribution to Understanding

Cognition frames this as a shift from "code is a commodity" to "context is the new precious resource." If code generation costs approach zero (and with models like Claude writing 135K commits/day, it is heading there), the value shifts to understanding why code was written the way it was. Agent Trace captures that why.

Who Backs It

Agent Trace is not a Cursor-only initiative. The RFC launched with backing from:

Cursor

Publisher of the spec. Reference implementation in their editor. 620+ GitHub stars on the repo.

Cognition

Devin's team. Published the context graphs companion post. Focused on agent-to-agent context passing.

Cloudflare

Infrastructure backing. Potential for edge-based trace storage and retrieval at scale.

Vercel

Deployment platform integration. Could surface AI attribution in deploy previews and PR checks.

Google Jules

Google's coding agent. Cross-vendor support signals this is not just an IDE war play.

Amp

AI coding agent with Agent Trace support built in.

OpenCode

Open-source AI coding agent. Community-driven implementation of the spec.

git-ai

Git extension for AI attribution. Native integration with existing Git workflows.

The spec is released under CC BY 4.0. The GitHub repo (cursor/agent-trace) has 620+ stars and 48 forks with 15 open issues, most discussing edge cases around merges and rebases.

What Agent Trace Does Not Do

The spec explicitly lists four non-goals:

No Legal Ownership

Agent Trace does not define who owns AI-generated code. It records attribution metadata. Legal interpretation is left to organizations and their counsel.

No Training Data Provenance

The spec does not track what training data produced a model's output. That is a model-level concern, not a code-attribution concern.

No Quality Assessment

Agent Trace does not rate AI-generated code. It does not flag bugs, security issues, or hallucinations. It records origin, not quality.

No UI Requirements

The spec defines data, not presentation. Editors, CI tools, and dashboards can render traces however they want. The spec is deliberately UI-agnostic.

This restraint is deliberate. By limiting scope to attribution data, the spec avoids the political and legal landmines that would slow adoption. A compliance team and a solo developer can both use the same trace format for different purposes.

Open Questions (Why It's Still an RFC)

Agent Trace v0.1.0 is explicitly a Request for Comments. The spec leaves several problems unresolved:

  • Merge conflicts: When two branches with trace records merge, how should conflicting attributions resolve? The spec does not define merge semantics.
  • Rebases: Rebasing rewrites commit history. Trace records tied to old commit SHAs become orphaned. Jujutsu's stable Change IDs help here, but Git's SHA model does not.
  • Large-scale agent changes: When an agent modifies 500 files in a single conversation, the trace record grows proportionally. The spec does not define size limits or compression strategies.
  • Retroactive attribution: How do you attribute existing codebases written before Agent Trace existed? The unknown contributor type is a placeholder, not a solution.
  • Mixed granularity: If a developer accepts an AI suggestion then changes one character, is the range ai or mixed? The threshold is undefined.

These are solvable problems, but solving them requires production usage data. Publishing as an RFC first and collecting implementation feedback is the right sequence.

What This Means for Coding Agents

Agent Trace matters for the same reason that structured logging mattered a decade ago. When every service emitted unstructured text logs, debugging distributed systems was archaeology. When the industry converged on structured formats (JSON logs, OpenTelemetry traces), observability tooling exploded.

AI code generation is in the "unstructured logs" phase. Every tool tracks attribution differently, or not at all. Claude Code adds itself as a Git co-author. Copilot has no attribution mechanism. Cursor stores conversation history internally. None of these are interoperable.

If Agent Trace reaches adoption, the tooling implications are significant:

CI/CD Integration

Pull request checks that surface AI attribution percentages. "This PR is 73% AI-generated, review accordingly." Block merges that exceed AI-generation thresholds in regulated codebases.
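A check along those lines is a straightforward fold over trace records. A sketch (record shape from the spec examples above; the function name and the idea of a percentage gate are ours, not defined by the spec):

```python
def ai_percentage(trace: dict) -> float:
    """Percentage of attributed lines whose contributor type is 'ai',
    across every file and conversation in one trace record."""
    ai_lines = total_lines = 0
    for file_entry in trace.get("files", []):
        for conv in file_entry.get("conversations", []):
            n = sum(r["end_line"] - r["start_line"] + 1
                    for r in conv.get("ranges", []))
            total_lines += n
            if conv["contributor"]["type"] == "ai":
                ai_lines += n
    return 100 * ai_lines / total_lines if total_lines else 0.0

trace = {
    "files": [{
        "path": "src/app.ts",
        "conversations": [
            {"contributor": {"type": "ai"},
             "ranges": [{"start_line": 1, "end_line": 73}]},
            {"contributor": {"type": "human"},
             "ranges": [{"start_line": 74, "end_line": 100}]},
        ],
    }]
}
print(f"{ai_percentage(trace):.0f}% AI-generated")  # 73% AI-generated
```

A CI job would aggregate this across all trace records attached to a PR and fail the check when the result crosses a team-configured threshold.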

Agent-to-Agent Context

When a coding agent modifies code another agent wrote, it follows the conversation URL to understand the original intent. Cognition reports 3-point SWE-Bench improvements from this context passing.

Security Auditing

Trace which model generated security-sensitive code. If a vulnerability is found in AI-generated code, trace back to the exact model version, conversation, and prompt that produced it.

Analytics Dashboards

Spenser Skates (Amplitude) announced they're building analytics on Agent Trace: AI code contribution tracking, model performance dashboards, and agents that surface issues from data.

The Subagent Connection

Multi-agent workflows amplify the attribution problem. When 16 Claude agents collaborate on a 100K-line compiler (as demonstrated in February 2026), a single commit may contain code from multiple models, multiple conversations, and multiple human reviewers. Without structured attribution, the provenance of each line is lost. Agent Trace's conversation-level grouping handles this by design: each agent's conversation gets its own trace entry within the same file.

For tools like Morph that power the subagent layer of coding agents, structured attribution is infrastructure. When a Morph Fast Apply call merges AI-generated edits into a file at 10,500+ tokens per second, the trace record captures exactly which ranges were AI-generated and which model produced them. This makes the entire apply-review-debug cycle traceable.

Frequently Asked Questions

What is Agent Trace?

An open specification (v0.1.0, RFC) published by Cursor in January 2026. It defines a vendor-neutral JSON format for recording which lines of code were written by AI, which by humans, and which are mixed. Backed by Cognition, Cloudflare, Vercel, Google Jules, Amp, and OpenCode. Released under CC BY 4.0.

How is Agent Trace different from git blame?

git blame shows who last modified a line. Agent Trace shows whether that modification came from a human, an AI model (with a specific model ID like anthropic/claude-opus-4-5-20251101), or a mix of both. It also links changes to the conversation that produced them, so reviewers can trace the full reasoning behind AI-generated code.

Where are trace records stored?

The spec is deliberately storage-agnostic. Implementations can store traces as local files, git notes, database entries, or any other mechanism. This flexibility is intentional: different teams have different infrastructure constraints.

Does Agent Trace determine code ownership or copyright?

No. The spec explicitly excludes legal ownership, copyright tracking, training data provenance, and quality assessment from its scope. It records attribution metadata only. Legal interpretation is left to organizations.

What version control systems does it support?

Git, Jujutsu (jj), Mercurial (hg), and Subversion (svn). Jujutsu's stable Change IDs are particularly well-suited because they survive rebases, solving one of Git's core attribution problems.

What are context graphs?

A knowledge representation framework that maps relationships between code, conversations, decisions, and contributors. Cognition argues Agent Trace is the first step toward context graphs for code. Foundation Capital calls context graphs "AI's trillion-dollar opportunity," predicting the next major platforms will be systems of record for decisions.

Is Agent Trace production-ready?

Not yet. Version 0.1.0 is an RFC with open questions around merges, rebases, and large-scale agent changes. Cursor includes a reference implementation, but the spec explicitly invites feedback before finalizing. Production adoption will require resolving the merge semantics and storage conventions that the RFC leaves open.

Build Faster Coding Agents with Morph

Morph powers the subagent layer that coding agents depend on. Fast Apply merges edits at 10,500+ tok/s. WarpGrep searches codebases 4x faster. Both produce the structured output that Agent Trace can attribute.