AI Agent Integration: How to Connect Agents to APIs, Tools, and Data Sources

Practical guide to integrating AI agents with external APIs, databases, file systems, and tools. Covers function calling, MCP, authentication, error handling, rate limiting, and security patterns with TypeScript code examples.

April 5, 2026 · 2 min read

An AI agent that cannot reach the outside world is a chatbot. Integration is what turns a language model into an agent: connecting it to APIs, databases, file systems, and execution environments so it can take real actions. This guide covers the five integration patterns, authentication, error handling, rate limiting, and security, with TypeScript code for each.

Why Integration Is the Hard Part

Most agent failures in production are not model failures. They are integration failures. The API returned a 429 and the agent had no retry logic. The database query timed out and the agent hallucinated a response. The OAuth token expired mid-session. The tool response exceeded the context window and the model lost track of what it was doing.

LLM APIs themselves fail 1-5% of the time due to rate limits, timeouts, and server errors. External tool calls add their own failure modes. A production agent calling five tools per task has a compounding failure surface that no amount of prompt engineering can fix. The fix is engineering: retry logic, circuit breakers, auth management, and input validation.
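The compounding is easy to quantify: if each tool call independently succeeds 98% of the time, a task needing five calls succeeds only about 90% of the time, so roughly one task in ten fails somewhere in the chain. A quick sketch:

```typescript
// Probability that a task with n independent tool calls completes
// without any single call failing, given a per-call failure rate.
function taskSuccessRate(perCallFailureRate: number, calls: number): number {
  return (1 - perCallFailureRate) ** calls;
}

// Five tool calls at a 2% per-call failure rate:
const success = taskSuccessRate(0.02, 5); // ~0.904
const taskFailureRate = 1 - success;      // ~0.096, about 1 in 10 tasks
```

The assumption of independence is optimistic: failures often correlate (one rate-limited service fails every call), which makes the engineering safeguards below even more necessary.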

1-5%: LLM API call failure rate
5: integration patterns in production
429: most common integration error
OAuth 2.1: standard auth protocol

The Five Integration Patterns

Every agent integration falls into one of five patterns. They differ in abstraction level, who controls execution, and how tools are discovered. Production agents typically combine two or three.

| | Function Calling | MCP | API Wrapper | Database/RAG | Sandbox |
|---|---|---|---|---|---|
| Abstraction | LLM-native | Protocol-level | Library-level | Query-level | Runtime-level |
| Who executes | Your code | MCP server | Your code | Database engine | Sandbox runtime |
| Discovery | Static schema | Dynamic server manifest | Hardcoded | Schema introspection | Filesystem access |
| Auth handling | Manual | Centralized | Per-wrapper | Connection string | Container isolation |
| Best for | 1-10 tools | Enterprise, multi-tool | 3rd-party APIs | Data retrieval | Code execution |

Pattern 1: Function Calling

Function calling is how the model tells your code what to do. You define tool schemas as JSON. The model outputs a structured call with the tool name and arguments. Your code executes the function and returns the result. The model then reasons about the result and decides whether to make another call or respond to the user.

Every major LLM provider supports function calling: OpenAI (tools parameter), Anthropic (tools with input_schema), and Google (FunctionDeclaration). The schemas differ but the pattern is identical.

Define a tool schema (OpenAI format)

const tools = [
  {
    type: "function",
    function: {
      name: "query_database",
      description: "Execute a read-only SQL query against the analytics database",
      parameters: {
        type: "object",
        properties: {
          query: {
            type: "string",
            description: "SQL SELECT query. No mutations allowed.",
          },
          database: {
            type: "string",
            enum: ["analytics", "users", "logs"],
            description: "Target database",
          },
        },
        required: ["query", "database"],
        additionalProperties: false,
      },
      strict: true,
    },
  },
];

The agent loop: call model, execute tools, repeat

import OpenAI from "openai";

const client = new OpenAI();

async function agentLoop(messages: OpenAI.ChatCompletionMessageParam[]) {
  const MAX_ITERATIONS = 10;

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await client.chat.completions.create({
      model: "gpt-4o",
      messages,
      tools,
    });

    const message = response.choices[0].message;
    messages.push(message);

    // No tool calls means the model is done
    if (!message.tool_calls?.length) {
      return message.content;
    }

    // Execute each tool call and append results
    for (const toolCall of message.tool_calls) {
      const args = JSON.parse(toolCall.function.arguments);
      const result = await executeTool(toolCall.function.name, args);

      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: JSON.stringify(result),
      });
    }
  }

  throw new Error("Agent exceeded maximum iterations");
}

Keep tool count under 20

10-20 tools work well. Past 20, the model makes more mistakes choosing which tool to call. If you have 50+ tools, use a two-stage approach: let the model pick a category first, then show only that category's tools.
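The two-stage approach can be sketched as follows. The categories and tool names here are hypothetical; the idea is that the model's first call sees only a routing tool, and the next call sees only the chosen category's schemas:

```typescript
// Hypothetical tool registry grouped by category.
const toolsByCategory: Record<string, { name: string }[]> = {
  github: [{ name: "search_issues" }, { name: "create_pr" }],
  database: [{ name: "query_analytics" }],
  files: [{ name: "read_file" }, { name: "write_file" }],
};

// Stage 1: the only tool the model sees is a category selector.
const selectCategoryTool = {
  type: "function" as const,
  function: {
    name: "select_tool_category",
    description: "Pick the tool category needed for this task",
    parameters: {
      type: "object",
      properties: {
        category: { type: "string", enum: Object.keys(toolsByCategory) },
      },
      required: ["category"],
      additionalProperties: false,
    },
    strict: true,
  },
};

// Stage 2: after the model picks, only that category's tools
// go into the next model call.
function toolsForCategory(category: string) {
  return toolsByCategory[category] ?? [];
}
```

This keeps each model call's tool list small at the cost of one extra round trip per task.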

Anthropic's format uses input_schema instead of parameters, and arguments come pre-parsed as objects (no JSON.parse needed). Google uses Protocol Buffer-style types. The execution loop is the same across all providers: check for tool calls, execute, append results, call the model again.
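For comparison, a sketch of the same query_database tool in Anthropic's format: input_schema replaces the nested function.parameters wrapper, and name and description move to the top level.

```typescript
// Anthropic tool definition: flat shape with input_schema.
const anthropicTools = [
  {
    name: "query_database",
    description: "Execute a read-only SQL query against the analytics database",
    input_schema: {
      type: "object" as const,
      properties: {
        query: {
          type: "string",
          description: "SQL SELECT query. No mutations allowed.",
        },
        database: {
          type: "string",
          enum: ["analytics", "users", "logs"],
          description: "Target database",
        },
      },
      required: ["query", "database"],
    },
  },
];

// When the model calls the tool, block.input is already a parsed object:
// if (block.type === "tool_use") { await executeTool(block.name, block.input); }
```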

Pattern 2: Model Context Protocol (MCP)

MCP standardizes how agents discover and invoke tools. Instead of hardcoding tool schemas in your application, an MCP server advertises its tools dynamically. The agent connects, discovers available tools, and invokes them through a standard protocol. This decouples the agent from its tools.

Anthropic created MCP in late 2024 and donated it to the Linux Foundation's Agentic AI Foundation (AAIF) in December 2025. OpenAI deprecated its Assistants API in favor of the Responses API, which supports MCP natively, with a mid-2026 sunset. MCP is becoming the default protocol for AI agent tooling.

Dynamic Discovery

Tools are advertised by the MCP server at runtime. The agent doesn't need to know what tools exist at build time. New tools appear automatically when the server adds them.

Centralized Auth

The MCP server handles authentication with downstream services. The agent sends requests to the MCP server. The server manages API keys, OAuth flows, and token rotation.

Standard Transport

stdio for local servers (no network exposure), HTTP + SSE for remote servers. JSON-RPC 2.0 message format. Consistent across all MCP-compatible tools.
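On the wire, an MCP tool invocation is a plain JSON-RPC 2.0 request using the tools/call method. A minimal sketch of the envelope (the id and arguments are illustrative):

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "search_issues",
    arguments: { repo: "owner/repo", query: "streaming bug" },
  },
};

// Over stdio this is serialized as one newline-delimited JSON message;
// over HTTP it is the POST body, with responses streamed via SSE.
const wire = JSON.stringify(request);
```

The SDK examples below hide this envelope entirely, but it is useful to know when debugging a misbehaving server with raw logs.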

MCP server: expose a tool

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({
  name: "github-tools",
  version: "1.0.0",
});

server.tool(
  "search_issues",
  "Search GitHub issues by query, label, or assignee",
  {
    repo: z.string().describe("owner/repo format"),
    query: z.string().describe("Search query"),
    state: z.enum(["open", "closed", "all"]).default("open"),
    labels: z.array(z.string()).optional(),
    limit: z.number().max(100).default(20),
  },
  async ({ repo, query, state, labels, limit }) => {
    const issues = await searchGitHubIssues(repo, query, state, labels, limit);
    return {
      content: [{ type: "text", text: JSON.stringify(issues, null, 2) }],
    };
  }
);

const transport = new StdioServerTransport();
await server.connect(transport);

MCP client: connect and discover tools

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "node",
  args: ["./github-mcp-server.js"],
});

const client = new Client({ name: "my-agent", version: "1.0.0" });
await client.connect(transport);

// Discover tools dynamically
const { tools } = await client.listTools();
// tools = [{ name: "search_issues", description: "...", inputSchema: {...} }]

// Invoke a tool
const result = await client.callTool({
  name: "search_issues",
  arguments: {
    repo: "anthropics/anthropic-sdk-python",
    query: "streaming bug",
    state: "open",
    limit: 5,
  },
});

MCP vs function calling

MCP and function calling are complementary. Function calling is how the LLM communicates tool invocations. MCP is how those tools are discovered and managed. In practice, the MCP client converts MCP tool definitions into the provider's function calling format (OpenAI, Anthropic, or Google) before passing them to the model.
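A minimal sketch of that conversion step, mapping MCP tool definitions into OpenAI's function calling shape (the sample tool is illustrative; MCP's inputSchema is already JSON Schema, so it maps directly onto parameters):

```typescript
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>;
}

// Convert MCP tool definitions to OpenAI's tools parameter format.
function toOpenAiTools(mcpTools: McpTool[]) {
  return mcpTools.map((t) => ({
    type: "function" as const,
    function: {
      name: t.name,
      description: t.description ?? "",
      parameters: t.inputSchema, // already JSON Schema
    },
  }));
}
```

When the model then emits a tool call, the client routes it back to the MCP server via callTool.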

Pattern 3: API Wrappers with Auth and Error Handling

Not every integration needs MCP. For a small number of well-known APIs, a typed wrapper with built-in auth management and error handling is simpler and more reliable. The wrapper handles token refresh, rate limit headers, retry logic, and response validation. The agent just calls the function.

API wrapper with auth, retry, and validation

interface ApiClientConfig {
  baseUrl: string;
  getToken: () => Promise<string>;  // token provider, not raw token
  maxRetries?: number;
  timeout?: number;
}

class ApiClient {
  private config: Required<ApiClientConfig>;
  private circuitOpen = false;
  private failureCount = 0;

  constructor(config: ApiClientConfig) {
    this.config = {
      maxRetries: 3,
      timeout: 10_000,
      ...config,
    };
  }

  async request<T>(path: string, options: RequestInit = {}): Promise<T> {
    if (this.circuitOpen) {
      throw new Error("Circuit breaker open: service unavailable");
    }

    for (let attempt = 0; attempt <= this.config.maxRetries; attempt++) {
      try {
        const token = await this.config.getToken();
        const controller = new AbortController();
        const timeoutId = setTimeout(
          () => controller.abort(),
          this.config.timeout
        );

        const response = await fetch(`${this.config.baseUrl}${path}`, {
          ...options,
          signal: controller.signal,
          headers: {
            Authorization: `Bearer ${token}`,
            "Content-Type": "application/json",
            ...options.headers,
          },
        });

        clearTimeout(timeoutId);

        // Permanent errors: don't retry
        if (response.status === 401 || response.status === 403) {
          throw new Error(`Auth failed: ${response.status}`);
        }
        if (response.status === 400 || response.status === 404) {
          const body = await response.text();
          throw new Error(`Client error ${response.status}: ${body}`);
        }

        // Rate limited: respect Retry-After header
        if (response.status === 429) {
          const retryAfter = response.headers.get("Retry-After");
          const delay = retryAfter ? parseInt(retryAfter, 10) * 1000 : 2 ** attempt * 1000;
          await sleep(delay + Math.random() * 1000);  // jitter
          continue;
        }

        // Server error: retry with backoff
        if (response.status >= 500) {
          if (attempt < this.config.maxRetries) {
            await sleep(2 ** attempt * 1000 + Math.random() * 1000);
            continue;
          }
          throw new Error(`Server error after ${attempt + 1} attempts`);
        }

        this.failureCount = 0;
        return response.json() as T;
      } catch (error) {
        if (error instanceof Error && error.name === "AbortError") {
          if (attempt < this.config.maxRetries) {
            await sleep(2 ** attempt * 1000);
            continue;
          }
        }
        this.failureCount++;
        if (this.failureCount >= 5) {
          this.circuitOpen = true;
          setTimeout(() => { this.circuitOpen = false; }, 30_000);
        }
        throw error;
      }
    }
    throw new Error("Max retries exceeded");
  }
}

function sleep(ms: number) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

The key details: the getToken function is a provider, not a raw token string. This lets you swap in a token refresh mechanism without changing the client. The circuit breaker opens after 5 consecutive failures and stays open for 30 seconds, giving the downstream service time to recover. Jitter on retries prevents thundering herd when multiple agents hit the same service.

Pattern 4: Database Connections and RAG

Agents that answer questions about private data need database access. Two patterns dominate: direct SQL queries for structured data, and vector search (RAG) for unstructured documents.

SQL tool: let the agent query structured data

import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

// Tool definition for the LLM
const queryTool = {
  type: "function" as const,
  function: {
    name: "query_analytics",
    description: "Run a read-only SQL query against the analytics database. Returns up to 100 rows.",
    parameters: {
      type: "object",
      properties: {
        query: {
          type: "string",
          description: "PostgreSQL SELECT query. JOINs allowed. No INSERT/UPDATE/DELETE.",
        },
      },
      required: ["query"],
      additionalProperties: false,
    },
    strict: true,
  },
};

async function executeQuery(query: string): Promise<string> {
  // Guard: only allow SELECT
  const normalized = query.trim().toUpperCase();
  if (!normalized.startsWith("SELECT")) {
    return JSON.stringify({ error: "Only SELECT queries allowed" });
  }

  const client = await pool.connect();
  try {
    // Read-only transaction with timeout
    await client.query("SET statement_timeout = '5000'");
    await client.query("BEGIN READ ONLY");
    const result = await client.query(query);
    await client.query("COMMIT");

    // Truncate large results
    const rows = result.rows.slice(0, 100);
    const output = JSON.stringify(rows, null, 2);
    if (output.length > 8000) {
      return JSON.stringify({
        rows: rows.slice(0, 20),
        truncated: true,
        totalRows: result.rowCount,
      });
    }
    return output;
  } catch (error) {
    await client.query("ROLLBACK");
    return JSON.stringify({ error: (error as Error).message });
  } finally {
    client.release();
  }
}

Vector search tool: RAG for unstructured documents

import { OpenAI } from "openai";
import { Pool } from "pg";

const openai = new OpenAI();
// Same database as the SQL tool above
const pool = new Pool({ connectionString: process.env.DATABASE_URL });

const searchTool = {
  type: "function" as const,
  function: {
    name: "search_docs",
    description: "Semantic search across internal documentation. Returns the 5 most relevant passages.",
    parameters: {
      type: "object",
      properties: {
        query: {
          type: "string",
          description: "Natural language search query",
        },
        collection: {
          type: "string",
          enum: ["engineering-docs", "api-reference", "runbooks"],
        },
      },
      required: ["query", "collection"],
      additionalProperties: false,
    },
    strict: true,
  },
};

async function searchDocs(query: string, collection: string) {
  // 1. Embed the query
  const embedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const vector = embedding.data[0].embedding;

  // 2. Query the vector store (pgvector example)
  const results = await pool.query(
    `SELECT content, metadata, 1 - (embedding <=> $1::vector) AS similarity
     FROM documents
     WHERE collection = $2
     ORDER BY embedding <=> $1::vector
     LIMIT 5`,
    [JSON.stringify(vector), collection]
  );

  // 3. Return passages with source attribution
  return results.rows.map((row) => ({
    content: row.content,
    source: row.metadata.source,
    similarity: Math.round(row.similarity * 100) / 100,
  }));
}

Truncate before returning to the model

Large query results eat context window. A 50-row SQL result can be 10,000+ tokens. Always limit rows, truncate long values, and set a character budget on tool responses. 8,000 characters is a reasonable ceiling for a single tool result.
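That budget can be enforced with a small helper; the 8,000-character ceiling and the [truncated] marker follow this article's conventions, not any standard:

```typescript
// Enforce a character budget on tool results before they reach the model.
function budgetToolResult(result: string, maxChars = 8_000): string {
  if (result.length <= maxChars) return result;
  const omitted = result.length - maxChars;
  return result.slice(0, maxChars) + `\n[truncated: ${omitted} chars omitted]`;
}
```

Telling the model how much was omitted lets it decide whether to re-query with a narrower filter.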

Pattern 5: Sandbox Execution

Agents that generate code need somewhere to run it. Running LLM-generated code on your host machine is a security risk. Sandbox environments provide isolated execution with their own filesystem, process space, and network policy.

Sandbox tool: execute generated code safely

interface SandboxResult {
  stdout: string;
  stderr: string;
  exitCode: number;
  files?: Record<string, string>;
}

const executeTool = {
  type: "function" as const,
  function: {
    name: "execute_code",
    description: "Execute code in an isolated sandbox (Node.js 22 or Python 3.12 runtime). No network access. 30-second timeout.",
    parameters: {
      type: "object",
      properties: {
        code: {
          type: "string",
          description: "TypeScript or JavaScript code to execute",
        },
        language: {
          type: "string",
          enum: ["typescript", "javascript", "python"],
        },
      },
      required: ["code", "language"],
      additionalProperties: false,
    },
    strict: true,
  },
};

async function executeInSandbox(
  code: string,
  language: string
): Promise<SandboxResult> {
  // Start an isolated container. createSandbox stands in for your
  // sandbox provider's API (E2B, Modal, a Docker wrapper, etc.)
  const sandbox = await createSandbox({
    image: language === "python" ? "python:3.12-slim" : "node:22-slim",
    timeout: 30_000,
    memory: "256m",
    network: "none",  // no outbound network
  });

  try {
    // Write the code file
    const ext = language === "python" ? "py" : "ts";
    await sandbox.writeFile(`/tmp/main.${ext}`, code);

    // Execute
    const cmd = language === "python"
      ? "python3 /tmp/main.py"
      : "npx tsx /tmp/main.ts";
    const result = await sandbox.exec(cmd);

    return {
      stdout: result.stdout.slice(0, 4000),   // truncate
      stderr: result.stderr.slice(0, 2000),
      exitCode: result.exitCode,
    };
  } finally {
    await sandbox.destroy();
  }
}

The critical constraints: no network access (prevents data exfiltration), memory limits (prevents resource exhaustion), and timeouts (prevents infinite loops). Truncate stdout/stderr before returning to the model. A runaway console.log in a loop can produce megabytes of output that floods the context window.

Authentication Patterns

Authentication is the most common failure point in agent integrations. Tokens expire. Scopes are too broad. Secrets leak into conversation history. The patterns below handle these cases.

OAuth 2.1 + PKCE

For user-facing flows where the agent acts on behalf of a user. Authorization code flow with PKCE prevents token interception. Tokens are scoped to the specific actions the agent needs.

Client Credentials

For server-to-server communication where no user is involved. The agent authenticates as itself. Used for background jobs, cron-triggered agent runs, and system-level integrations.

Scoped Tokens

Each tool gets the minimum scopes required. A GitHub tool gets repo:read, not repo:admin. A database tool gets SELECT, not ALL PRIVILEGES. Lateral movement from a compromised tool is limited to that tool's scope.

Token Rotation

Long-running agents need automatic token refresh. Short-lived access tokens (15 min) with refresh tokens stored in a secrets manager. Never store tokens in agent context or conversation history.
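The PKCE half of the OAuth 2.1 flow above is small enough to sketch with Node's crypto module: the client generates a random code_verifier, sends its SHA-256 code_challenge with the authorization request, and later proves possession by sending the raw verifier with the token exchange.

```typescript
import { createHash, randomBytes } from "node:crypto";

// PKCE: code_verifier is random; code_challenge = BASE64URL(SHA256(verifier)).
function createPkcePair() {
  const verifier = randomBytes(32).toString("base64url"); // 43 chars
  const challenge = createHash("sha256").update(verifier).digest("base64url");
  return { verifier, challenge };
}

// The challenge (with code_challenge_method=S256) goes in the authorization
// request; the verifier is sent only with the token exchange, so an
// intercepted authorization code is useless on its own.
```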

Token provider with automatic refresh

class TokenProvider {
  private accessToken: string | null = null;
  private expiresAt = 0;

  constructor(
    private clientId: string,
    private clientSecret: string,
    private tokenUrl: string,
    private scopes: string[],
  ) {}

  async getToken(): Promise<string> {
    // Return cached token if still valid (with 60s buffer)
    if (this.accessToken && Date.now() < this.expiresAt - 60_000) {
      return this.accessToken;
    }

    const response = await fetch(this.tokenUrl, {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: new URLSearchParams({
        grant_type: "client_credentials",
        client_id: this.clientId,
        client_secret: this.clientSecret,
        scope: this.scopes.join(" "),
      }),
    });

    if (!response.ok) {
      throw new Error(`Token refresh failed: ${response.status}`);
    }

    const data = await response.json();
    this.accessToken = data.access_token;
    this.expiresAt = Date.now() + data.expires_in * 1000;
    return this.accessToken!;
  }
}

// Usage: inject into the API client (endpoint and scope names are illustrative)
const githubTokens = new TokenProvider(
  process.env.GITHUB_CLIENT_ID!,
  process.env.GITHUB_CLIENT_SECRET!,
  "https://github.com/login/oauth/access_token",
  ["repo:read", "issues:write"],
);

const github = new ApiClient({
  baseUrl: "https://api.github.com",
  getToken: () => githubTokens.getToken(),
});

Error Handling and Retry Logic

Agents encounter non-deterministic failures that do not exist in traditional software: partial LLM responses, tool timeouts, context window overflow, and model unavailability. A layered approach handles each failure mode.

| Error Type | Examples | Strategy | Retry? |
|---|---|---|---|
| Transient | 429 rate limit, 503 service unavailable, timeout | Exponential backoff with jitter | Yes, up to 5 attempts |
| Permanent | 401 unauthorized, 400 bad request, invalid schema | Fail fast, return error to model | No |
| Degraded | Partial results, slow response, stale data | Return with warning flag, let model decide | Optional |
| Model failure | Invalid tool call, hallucinated args, loop | Fallback model chain or human escalation | With different model |

Resilient tool executor with error classification

type ErrorCategory = "transient" | "permanent" | "degraded";

function classifyError(status: number, error?: Error): ErrorCategory {
  if (status === 429 || status >= 500) return "transient";
  if (status === 401 || status === 403 || status === 400) return "permanent";
  if (error?.name === "AbortError") return "transient";
  return "permanent";
}

async function executeToolSafe(
  name: string,
  args: Record<string, unknown>
): Promise<{ success: boolean; result: string }> {
  try {
    const result = await executeTool(name, args);
    return { success: true, result: JSON.stringify(result) };
  } catch (error) {
    // Return errors as structured results so the model can reason about them
    const message = error instanceof Error ? error.message : "Unknown error";
    return {
      success: false,
      result: JSON.stringify({
        error: message,
        tool: name,
        suggestion: getSuggestion(message),
      }),
    };
  }
}

function getSuggestion(error: string): string {
  if (error.includes("rate limit")) {
    return "Rate limited. Wait 30 seconds and try again, or use a different approach.";
  }
  if (error.includes("timeout")) {
    return "Request timed out. Try a simpler query or break into smaller operations.";
  }
  if (error.includes("Auth failed")) {
    return "Authentication failed. This tool is currently unavailable.";
  }
  return "Operation failed. Try an alternative approach.";
}

The critical pattern: return errors as structured tool results, not exceptions. When the model receives { error: "Rate limited", suggestion: "Wait 30s" }, it can reason about the failure and choose an alternative. When it receives nothing (because the error was swallowed), it hallucinates a result.

Rate Limiting and Cost Control

Agents can burn through API quotas and LLM credits in minutes without guardrails. Rate limiting operates at three levels: per-tool (prevent any single integration from exhausting quotas), per-agent (cap total operations per execution), and per-dollar (halt when spending exceeds a threshold).

Token bucket rate limiter for agent tools

class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,   // max burst size
    private refillRate: number, // tokens per second
  ) {
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  async acquire(cost: number = 1): Promise<boolean> {
    this.refill();

    if (this.tokens >= cost) {
      this.tokens -= cost;
      return true;
    }

    // Calculate wait time
    const deficit = cost - this.tokens;
    const waitMs = (deficit / this.refillRate) * 1000;
    await sleep(waitMs);
    this.refill();
    this.tokens -= cost;
    return true;
  }

  private refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}

// Per-tool rate limits
const rateLimiters = {
  github: new TokenBucket(30, 5),     // 30 burst, 5/sec sustained
  database: new TokenBucket(10, 2),   // 10 burst, 2/sec sustained
  search: new TokenBucket(5, 1),      // 5 burst, 1/sec sustained
};

// Cost circuit breaker
class CostBreaker {
  private spent = 0;

  constructor(private maxDollars: number) {}

  track(tokens: number, costPer1k: number) {
    this.spent += (tokens / 1000) * costPer1k;
    if (this.spent > this.maxDollars) {
      throw new Error(`Cost limit exceeded: $${this.spent.toFixed(2)} > $${this.maxDollars}`);
    }
  }
}

const costBreaker = new CostBreaker(5.00); // halt at $5

The cost circuit breaker is important for autonomous agents. A model stuck in a retry loop can make hundreds of API calls in seconds. At $0.01-0.03 per 1K tokens, a runaway agent can accumulate $50+ in charges before a human notices. The breaker halts execution at a configurable threshold.

Security

Agent integrations expand the attack surface of your application. The model can be prompted to invoke tools in unintended ways. Tool responses can contain prompt injection payloads. Tokens can leak into conversation history.

Least Privilege

Each tool gets the minimum scopes required. Database tools get read-only access. File system tools get access to specific directories only. API tools get the narrowest OAuth scopes that cover their function.

Output Sanitization

Validate and sanitize all tool outputs before returning them to the model. Truncate oversized responses. Strip sensitive fields (tokens, passwords, PII) from API responses. Never pass raw database rows containing user data to the model without filtering.

Audit Logging

Log every tool invocation with: timestamp, tool name, arguments, response size, latency, and the agent session ID. This is not optional. When an agent does something unexpected, you need to reconstruct exactly what happened.

Input Validation

Validate tool arguments against their schema before execution, not just at the LLM level. A model can output valid JSON that fails business logic constraints. Check string lengths, numeric ranges, enum membership, and SQL injection patterns.
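The executor below delegates this to a validateArgs helper, which in production would be a full JSON Schema validator such as Ajv. A minimal hand-rolled sketch covering the checks named above (required fields, primitive types, enum membership):

```typescript
interface JsonSchema {
  type: string;
  properties?: Record<string, { type?: string; enum?: string[] }>;
  required?: string[];
}

// Minimal argument validator: required fields, primitive types, enums.
// Does not cover JSON Schema's "integer" or "array" types; use a real
// validator library for those.
function validateArgs(
  schema: JsonSchema,
  args: Record<string, unknown>
): { valid: boolean; errors: string[] } {
  const errors: string[] = [];

  for (const field of schema.required ?? []) {
    if (!(field in args)) errors.push(`missing required field: ${field}`);
  }

  for (const [key, value] of Object.entries(args)) {
    const prop = schema.properties?.[key];
    if (!prop) {
      errors.push(`unexpected field: ${key}`);
      continue;
    }
    if (prop.type && typeof value !== prop.type) {
      errors.push(`${key}: expected ${prop.type}, got ${typeof value}`);
    }
    if (prop.enum && !prop.enum.includes(value as string)) {
      errors.push(`${key}: must be one of ${prop.enum.join(", ")}`);
    }
  }

  return { valid: errors.length === 0, errors };
}
```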

Tool execution with audit logging and validation

interface AuditEntry {
  timestamp: string;
  sessionId: string;
  tool: string;
  args: Record<string, unknown>;
  responseSize: number;
  latencyMs: number;
  success: boolean;
  error?: string;
}

async function executeToolAudited(
  sessionId: string,
  tool: string,
  args: Record<string, unknown>,
): Promise<string> {
  const start = Date.now();

  // 1. Validate against schema
  const schema = toolSchemas[tool];
  if (!schema) throw new Error(`Unknown tool: ${tool}`);

  const validation = validateArgs(schema, args);
  if (!validation.valid) {
    await writeAuditLog({
      timestamp: new Date().toISOString(),
      sessionId,
      tool,
      args,
      responseSize: 0,
      latencyMs: Date.now() - start,
      success: false,
      error: `Validation: ${validation.errors.join(", ")}`,
    });
    return JSON.stringify({ error: validation.errors });
  }

  // 2. Check rate limit
  const limiter = rateLimiters[tool as keyof typeof rateLimiters];
  if (limiter) await limiter.acquire();

  // 3. Execute
  try {
    const result = await executeTool(tool, args);
    const response = sanitizeOutput(JSON.stringify(result));

    await writeAuditLog({
      timestamp: new Date().toISOString(),
      sessionId,
      tool,
      args: redactSensitiveArgs(args),
      responseSize: response.length,
      latencyMs: Date.now() - start,
      success: true,
    });

    return response;
  } catch (error) {
    const message = error instanceof Error ? error.message : "Unknown";
    await writeAuditLog({
      timestamp: new Date().toISOString(),
      sessionId,
      tool,
      args: redactSensitiveArgs(args),
      responseSize: 0,
      latencyMs: Date.now() - start,
      success: false,
      error: message,
    });
    return JSON.stringify({ error: message });
  }
}

function sanitizeOutput(output: string): string {
  // Truncate oversized responses
  if (output.length > 8000) {
    return output.slice(0, 8000) + "\n[truncated]";
  }
  // Strip potential secrets
  return output.replace(
    /(?:sk|pk|key|token|secret|password)[_-]?[a-zA-Z0-9]{20,}/gi,
    "[REDACTED]"
  );
}

Pre-Built Integrations: Morph

Building agent integrations from scratch is engineering-intensive. Morph provides three pre-built integration components that coding agents need, available as MCP servers or SDK tools for Anthropic, OpenAI, Google, and Vercel AI SDK.

WarpGrep: Agentic Code Search

Runs in its own context window with 8 parallel tool calls per turn across 4 turns. Returns relevant file spans in under 6 seconds. On SWE-Bench Pro, makes Opus 15.6% cheaper and 28% faster than searching on its own.

Fast Apply: Code Edit Merging

Merges LLM-generated code edits at 10,500+ tokens per second with deterministic merge behavior. The reliability layer between your agent's output and your repository. Available via API or MCP.

Compact: Context Compression

Verbatim context compaction at 33,000 tokens per second. Preserves file paths, error codes, and code references while removing conversational filler. Keeps long-running agents under their context limit.

Add Morph tools to your agent via SDK

import Anthropic from "@anthropic-ai/sdk";
import Morph from "@morphllm/morphsdk";

const morph = new Morph({ apiKey: process.env.MORPH_API_KEY });
const anthropic = new Anthropic();

// Create WarpGrep as a tool for your agent
const warpgrepTool = morph.anthropic.createWarpGrepTool({
  repoRoot: "/path/to/your/repo",
  excludes: ["node_modules", "dist", ".git"],
});

// Use it in the standard function calling loop
const response = await anthropic.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 4096,
  tools: [warpgrepTool.definition, ...otherTools],
  messages: [{ role: "user", content: "Find all rate limiting logic in the codebase" }],
});

// Execute WarpGrep when the model calls it
for (const block of response.content) {
  if (block.type === "tool_use" && block.name === warpgrepTool.definition.name) {
    const result = await warpgrepTool.execute(block.input);
    // result contains relevant file spans, ready to pass back to the model
  }
}

Or add via MCP (one command)

npx -y @morphllm/morph-setup --morph-api-key YOUR_API_KEY

The MCP setup auto-configures for Claude Code, Cursor, Windsurf, Codex, Amp, OpenCode, and Antigravity. No code changes needed. The agent discovers WarpGrep, Fast Apply, and Compact as available tools and uses them when relevant.

Frequently Asked Questions

What is AI agent integration?

The process of connecting an AI agent to external systems so it can take real actions. Without integration, a model can only generate text. With integration, it can query databases, call APIs, execute code, and modify files. The five main patterns are function calling, MCP, API wrappers, database/RAG connections, and sandbox execution.

What is the difference between function calling and MCP?

Function calling is an LLM feature: you define tool schemas, the model outputs structured calls, your code executes them. MCP is a protocol that standardizes tool discovery and invocation across providers. They are complementary. MCP manages tools. Function calling is how the LLM invokes them. In practice, the MCP client converts tool definitions into the provider's function calling format before passing them to the model.

How do you handle authentication for agent integrations?

OAuth 2.1 with PKCE for user-facing flows. Client credentials for server-to-server. Scoped tokens with automatic rotation for long-running agents. Never share tokens between tools. Never store tokens in agent context or conversation history. Use a secrets manager and inject tokens at execution time.

How do you handle errors in agent integrations?

Classify errors into transient (retry with exponential backoff), permanent (fail fast), and degraded (fallback). Return errors as structured tool results so the model can reason about what happened. Never swallow errors silently. A model that receives nothing will hallucinate a result.

What are the most common agent integrations?

REST APIs for SaaS tools, databases (PostgreSQL, MySQL) for structured data, vector stores (Pinecone, pgvector) for semantic search, file systems for code and documents, browsers (Playwright) for web interaction, code execution sandboxes (E2B, Modal, Docker) for running generated code, and search engines for real-time information.

How do you rate limit AI agent API calls?

Three levels: per-tool limits prevent any single integration from exhausting quotas, per-agent limits cap total operations per execution, and cost circuit breakers halt when spending exceeds a threshold. Use token bucket algorithms for burst protection. Always parse 429 responses and Retry-After headers from upstream services.

Is MCP replacing direct API integration?

MCP is becoming the default protocol for AI agent tooling. OpenAI deprecated its Assistants API in favor of the Responses API, which supports MCP natively. But MCP does not replace direct API integration. It standardizes the interface between the agent and its tools. The tool implementation still calls APIs directly. MCP adds discovery, auth management, and a standard transport layer on top.

Skip the Integration Work

Morph provides pre-built integrations for the hardest parts of coding agents: code search (WarpGrep), edit merging (Fast Apply), and context management (Compact). Available as MCP servers or SDK tools. One command to install, no integration code to maintain.