Docker Sandbox: Running AI Agents in Isolated Docker Environments (2026)

Why Docker Containers Are Not Sandboxes

Docker containers isolate processes using Linux namespaces and cgroups. They control what a process can see (its own PID namespace, mount namespace, network namespace) and how much it can consume (CPU shares, memory limits). This is sufficient for running trusted application code on shared infrastructure. It is not sufficient for running untrusted code from an AI agent.

The core problem: every container on a host shares the same Linux kernel, so a kernel vulnerability is a vulnerability in every container on that machine. This is not a theoretical concern. CVE-2022-0185 allowed container escape through a heap overflow in the kernel's filesystem context code. CVE-2016-5195 (Dirty COW), a race condition in the kernel's copy-on-write handling, allowed privilege escalation that could be used to break out of a container to root on the host. The container runtime is an attack surface too: CVE-2019-5736 allowed overwriting the host runc binary from inside a container.

Host kernel in standard Docker: Shared

Containers affected per kernel vulnerability: All

Misconfiguration compounds the risk. Containers running with --privileged, with CAP_SYS_ADMIN, or with the Docker socket mounted (/var/run/docker.sock) give attackers a direct path to the host. AI agents that install packages, run build tools, and execute arbitrary code are exactly the workload where these risks matter most.
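
The three misconfigurations named above can be checked mechanically. The sketch below assumes the container description comes from `docker inspect` and reads only the `HostConfig.Privileged`, `HostConfig.CapAdd`, and `HostConfig.Binds` fields; the `InspectedContainer` type and `auditContainer` function are illustrative names, not part of any Docker tooling.

```typescript
// Minimal subset of the JSON printed by `docker inspect <container>`.
interface InspectedContainer {
  Name: string;
  HostConfig: {
    Privileged: boolean;
    CapAdd: string[] | null;
    Binds: string[] | null;
  };
}

// Returns one finding per risky setting named in the paragraph above.
function auditContainer(c: InspectedContainer): string[] {
  const findings: string[] = [];

  if (c.HostConfig.Privileged) {
    findings.push("runs with --privileged");
  }

  // Capabilities may be recorded with or without the CAP_ prefix.
  const caps = (c.HostConfig.CapAdd ?? []).map((cap) =>
    cap.toUpperCase().replace(/^CAP_/, ""),
  );
  // "ALL" grants every capability, which includes SYS_ADMIN.
  if (caps.includes("SYS_ADMIN") || caps.includes("ALL")) {
    findings.push("adds CAP_SYS_ADMIN");
  }

  // Bind mounts are recorded as "source:destination[:options]".
  const mountsSocket = (c.HostConfig.Binds ?? []).some(
    (bind) => bind.split(":")[0] === "/var/run/docker.sock",
  );
  if (mountsSocket) {
    findings.push("mounts the Docker socket");
  }

  return findings;
}
```

This only looks at `HostConfig.Binds`, so mounts declared another way may not be caught; treat an empty result as "nothing obvious found", not as proof of a safe configuration.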

Container Escape Is Not Exotic

Palo Alto Networks' Unit 42 research team documented that standard containers "are not truly sandboxed as they share the host OS kernel." CyberArk demonstrated kernel exploitation paths from container to host root. These are not edge cases. Any workload running untrusted code in a standard container is one kernel CVE away from full host compromise.

Docker Sandboxes: The MicroVM Approach

Docker Sandboxes, released in March 2026, takes a different approach. Instead of running agents inside containers that share the host kernel, it runs each agent in a lightweight microVM with its own dedicated Linux kernel. The product ships as a standalone CLI called sbx and does not require Docker Desktop.

Each sandbox is an isolated environment: its own kernel, its own Docker daemon, its own filesystem, its own network stack. An agent running inside a sandbox can build Docker images, install system packages, modify files, and run any command it wants. Apart from the workspace directory you choose to mount, none of this touches the host. If the agent does something destructive, the sandbox is destroyed and the rest of the host is unaffected.

Kernel per sandbox (microVM): Dedicated

The supported agents at launch: Claude Code, Gemini CLI, GitHub Copilot CLI, OpenAI Codex, OpenCode, Kiro, Docker Agent, and a generic shell mode. Docker Sandboxes also works with autonomous systems like OpenClaw and NanoClaw.

What Makes It Different from Docker-in-Docker

A common workaround for isolation is Docker-in-Docker (DinD): running a Docker daemon inside a container. The problem is that DinD typically requires privileged mode, which defeats the purpose of isolation. Docker Sandboxes solves this by running a full Docker Engine inside the microVM. The agent can use Docker normally, but the VM boundary prevents escape to the host. Agents can build and run Docker containers from inside the sandbox without ever being given access to the host daemon.

Four-Layer Isolation Model

Docker Sandboxes enforces isolation at four layers. Each layer addresses a different escape vector.

1. Hypervisor Isolation

Every sandbox runs inside a lightweight microVM with its own Linux kernel. The microVM shares no memory or processes with the host. A kernel exploit inside the sandbox compromises only that sandbox's kernel, not the host. This eliminates the entire class of container escape vulnerabilities that rely on shared kernel exploitation.

2. Network Isolation

Each sandbox has its own network namespace. Sandboxes cannot communicate with each other and cannot reach the host's localhost. All HTTP/HTTPS traffic is proxied through the host and filtered according to the selected network policy. Non-HTTP protocols are blocked entirely. You choose from three policies: open (all proxied traffic allowed), balanced (default deny with common dev sites allowed), or locked down (all traffic blocked unless explicitly allowed).
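
To make the policy behavior concrete, here is a minimal sketch of the allow/deny decision a host-side proxy could make for each outbound request. It is not Docker's implementation: the `PolicyConfig` shape, the `userAllowlist` field, and the contents of `BALANCED_DEFAULTS` are assumptions (the article does not list which "common dev sites" the balanced policy allows).

```typescript
type NetworkPolicy = "open" | "balanced" | "locked-down";

interface PolicyConfig {
  policy: NetworkPolicy;
  // Hosts the user has explicitly allowed (hypothetical field).
  userAllowlist: string[];
}

// Placeholder for "common dev sites"; the real list is an unknown.
const BALANCED_DEFAULTS = ["github.com", "registry.npmjs.org", "pypi.org"];

// Matches the host itself or any subdomain of it.
function hostMatches(host: string, allowed: string): boolean {
  return host === allowed || host.endsWith("." + allowed);
}

function isRequestAllowed(rawUrl: string, config: PolicyConfig): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are denied
  }

  // Non-HTTP protocols are blocked regardless of policy.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false;
  }
  if (config.policy === "open") {
    return true;
  }

  // Balanced and locked down are both default deny; they differ only
  // in whether the built-in allowlist applies.
  const allowlist =
    config.policy === "balanced"
      ? [...BALANCED_DEFAULTS, ...config.userAllowlist]
      : config.userAllowlist;
  return allowlist.some((allowed) => hostMatches(url.hostname, allowed));
}
```

Matching on the parsed hostname rather than on a substring of the URL matters here: a substring check would let `https://github.com.evil.example/` through.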

3. Docker Engine Isolation

Each sandbox runs its own Docker Engine, completely separate from the host's Docker daemon. When the agent runs docker build or docker compose up, those commands execute against the sandbox's engine. The agent has no path to the host Docker daemon. This is the key differentiator: Docker-in-sandbox without the security risks of Docker-in-Docker.

4. Credential Isolation

API keys are never stored inside the sandbox. A host-side proxy intercepts outbound API requests and injects authentication headers before forwarding. The agent can call APIs that require credentials, but the credential values never enter the VM. If the sandbox is compromised, the attacker cannot extract API keys.

Credential Proxy vs Environment Variables

Most Docker setups pass API keys as environment variables, which means any process inside the container can read them. Docker Sandboxes' credential proxy is a meaningful improvement: the proxy on the host side injects auth headers into outbound requests, so the sandbox process never has access to the raw credential. Use sbx secret set for credentials that support proxy-based injection. Fall back to environment variables only when proxy injection is not available for a specific service.
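
The header-injection idea can be sketched in a few lines. This is an illustration of the technique, not Docker's proxy: `OutboundRequest`, `InjectionRule`, `RULES`, and `injectCredentials` are hypothetical names, and the two rules shown simply reflect the header conventions of the Anthropic and OpenAI HTTP APIs.

```typescript
interface OutboundRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical mapping from an API host to the header carrying its secret.
interface InjectionRule {
  host: string;
  header: string; // lowercase header name
  secretName: string;
  format: (secret: string) => string;
}

const RULES: InjectionRule[] = [
  {
    host: "api.anthropic.com",
    header: "x-api-key",
    secretName: "ANTHROPIC_API_KEY",
    format: (s) => s,
  },
  {
    host: "api.openai.com",
    header: "authorization",
    secretName: "OPENAI_API_KEY",
    format: (s) => `Bearer ${s}`,
  },
];

// Runs on the host. The secrets object lives here and is never sent
// into the sandbox; the sandbox only ever sees its own request.
function injectCredentials(
  req: OutboundRequest,
  secrets: Record<string, string | undefined>,
): OutboundRequest {
  const host = new URL(req.url).hostname;
  const rule = RULES.find((r) => r.host === host);
  const secret = rule ? secrets[rule.secretName] : undefined;
  if (!rule || secret === undefined) {
    return req; // no rule or no secret: forward unchanged
  }

  // Drop whatever the sandbox sent for this header, then add the real value.
  const headers: Record<string, string> = {};
  for (const [name, value] of Object.entries(req.headers)) {
    if (name.toLowerCase() !== rule.header) {
      headers[name] = value;
    }
  }
  headers[rule.header] = rule.format(secret);
  return { url: req.url, headers };
}
```

Because injection is keyed on the destination host, a compromised sandbox cannot redirect the secret to a server of its choosing: a request to any other host is forwarded without the header.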

Getting Started with sbx CLI

The sbx CLI is the primary interface for Docker Sandboxes. It does not require Docker Desktop. Download the binary from the sbx-releases repository or install through your package manager.

System Requirements

Docker Sandboxes requires macOS on Apple silicon or Windows 11 (x86_64) with Windows Hypervisor Platform enabled. Linux support is on the roadmap. The CLI handles microVM provisioning locally, so no cloud account or remote infrastructure is needed.

Install and authenticate

# Download sbx (macOS Apple silicon)
# Binary available at github.com/docker/sbx-releases

# Authenticate with Docker
sbx login
# Opens browser for Docker OAuth

# First login prompts for default network policy:
#   Open        - all network traffic allowed
#   Balanced    - default deny, common dev sites allowed
#   Locked Down - all traffic blocked unless explicitly allowed

Run an agent in a sandbox

# Launch Claude Code in a sandbox
sbx run claude

# Launch Gemini CLI
sbx run gemini

# Launch a generic shell
sbx run shell

# The first run pulls the agent image (takes longer).
# Subsequent runs reuse the cached image and start in seconds.

Manage sandboxes and credentials

# Create a sandbox without attaching
sbx create --agent claude --name my-project

# Open a shell in an existing sandbox
sbx exec my-project

# Set a secret (injected via credential proxy, not env var)
sbx secret set ANTHROPIC_API_KEY

# View active security policies
sbx policy ls

# List running sandboxes
sbx ls

# Stop and remove a sandbox
sbx rm my-project
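
For scripted setups, the commands above can be assembled as argument arrays instead of shell strings. The sketch below only builds the arrays and runs nothing. It uses only the subcommands and flags shown in the listing above; whether `sbx create --agent` accepts every identifier that `sbx run` does is an assumption, and `provisionPlan` is a hypothetical helper.

```typescript
// Agent identifiers that appear in the `sbx run` examples above.
type Agent = "claude" | "gemini" | "shell";

// Each function returns an argv array (program first) and executes nothing.
function createArgs(agent: Agent, name: string): string[] {
  return ["sbx", "create", "--agent", agent, "--name", name];
}

function removeArgs(name: string): string[] {
  return ["sbx", "rm", name];
}

// Hypothetical helper: plan one named sandbox per project.
function provisionPlan(agent: Agent, projects: string[]): string[][] {
  return projects.map((project) => createArgs(agent, project));
}
```

Passing each array to a process API that takes arguments separately (for example Node's `execFile`) avoids shell quoting problems when a sandbox name contains spaces or shell metacharacters.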

Workspace Mounting

Your project directory is mounted into the sandbox through filesystem passthrough. Changes in either direction are instant, with no sync process or file copying. The agent sees your actual host files, so it can edit code, run tests, and commit changes. The host filesystem outside the mounted directory is not accessible from inside the sandbox.

Traditional Docker Isolation vs Docker Sandboxes

The security models are fundamentally different. Standard containers provide process-level isolation with a shared kernel. Docker Sandboxes provides VM-level isolation with dedicated kernels. Here is how they compare across the dimensions that matter for AI agent workloads.

Property	Standard Docker Container	Docker Sandbox (microVM)
Kernel	Shared with host and other containers	Dedicated per sandbox
Container escape risk	Kernel CVEs affect all containers on host	Kernel exploit isolated to single sandbox
Docker-in-Docker	Requires --privileged (defeats isolation)	Built-in isolated Docker Engine
Network isolation	Separate network namespace by default, implemented by the shared host kernel	Fully isolated, deny-by-default, HTTP proxy
Credential handling	Environment variables (readable by any process)	Host-side proxy injection (never enters VM)
Startup time	Sub-second	Seconds (first run), sub-second (cached)
Resource overhead	Minimal (shared kernel)	Higher (dedicated kernel per VM)
Use case	Trusted application workloads	Untrusted AI agent code execution

When Standard Containers Are Enough

If you control the code running inside the container, standard Docker provides adequate isolation for most workloads. Web servers, databases, CI pipelines running your own test suite: these are trusted workloads where kernel-level escape is not the primary threat model. The attack surface is manageable with good security practices: no privileged mode, minimal capabilities, no Docker socket mounting, regular kernel updates.
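
As one way to apply those practices, the sketch below builds the argument list for a hardened `docker run`. The flags are standard Docker CLI options; the `HardeningOptions` type, the `hardenedRunArgs` function, and the default limits are illustrative choices, not recommendations from Docker.

```typescript
interface HardeningOptions {
  image: string;
  command: string[];
  memory?: string; // cgroup memory limit, e.g. "512m"
  pidsLimit?: number; // cgroup cap on process count
  allowNetwork?: boolean;
}

// Builds arguments for `docker`; it never adds --privileged,
// extra capabilities, or a Docker socket mount.
function hardenedRunArgs(opts: HardeningOptions): string[] {
  const args = [
    "run",
    "--rm",
    "--cap-drop=ALL", // start from zero capabilities
    "--security-opt=no-new-privileges", // block setuid-style escalation
    "--read-only", // read-only root filesystem
    `--memory=${opts.memory ?? "512m"}`,
    `--pids-limit=${opts.pidsLimit ?? 256}`,
  ];
  if (!opts.allowNetwork) {
    args.push("--network=none");
  }
  return [...args, opts.image, ...opts.command];
}
```

These settings shrink the attack surface but do not change the trust boundary: the container still runs on the host kernel, so regular kernel updates remain part of the picture.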

When You Need MicroVM Isolation

AI coding agents generate and execute arbitrary code. The code is not reviewed before execution. The agent may install unknown packages, run build scripts from untrusted repositories, or execute commands that interact with the filesystem and network in unexpected ways. This is the definition of untrusted code. MicroVM isolation ensures that even a worst-case scenario (kernel exploit, privilege escalation) is contained within a single sandbox with no path to the host or other sandboxes.

When You Need a Sandbox API Instead

Docker Sandboxes is a local CLI tool. It runs microVMs on your machine, supports interactive agent sessions, and mounts your workspace. This is the right model for a developer running Claude Code on their laptop.

It is not the right model for three common scenarios:

Building AI Products

If your application needs to create sandboxes for users programmatically, you need a cloud-hosted sandbox API with HTTP/SDK access, not a local CLI. Your backend calls the API to create a sandbox, send code, and retrieve results. Docker Sandboxes does not expose a programmatic API for this use case.

Scaling to Many Users

A sandbox API handles concurrency, resource allocation, and lifecycle management across thousands of simultaneous sandboxes. Running microVMs on a single developer machine does not scale to production workloads with many concurrent users.

No Local Infrastructure

Sandbox APIs run in the cloud. No local Docker, no hypervisor requirements, no binary downloads. Your CI system, your serverless functions, and your backend services can all create sandboxes with a single API call regardless of what operating system they run on.

Dimension	Docker Sandboxes (sbx)	Sandbox API (Morph)
Access model	Local CLI on developer machine	Cloud API (HTTP/SDK)
Use case	Developer running agents interactively	Application creating sandboxes programmatically
Concurrency	Limited by local machine resources	Managed cloud scaling
Cold start	Seconds (first), sub-second (cached)	< 300ms (pre-warmed)
Infrastructure required	Local hypervisor (macOS/Windows)	None (cloud-hosted)
Pricing	Free (experimental)	Included with Morph API plans
Docker Desktop required	No	No
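
The guidance in the preceding sections can be condensed into a small decision helper. This is a deliberately coarse sketch: the `Workload` fields and the `chooseIsolation` function are illustrative, and a real decision would also weigh cost, latency, and compliance requirements.

```typescript
type Isolation = "standard-container" | "docker-sandbox" | "sandbox-api";

interface Workload {
  codeIsTrusted: boolean; // you wrote or reviewed the code
  createdProgrammatically: boolean; // your app creates sandboxes for users
  manyConcurrentUsers: boolean;
  hasLocalHypervisor: boolean; // a machine that can run local microVMs
}

function chooseIsolation(w: Workload): Isolation {
  // Trusted code: a well-configured container is usually adequate.
  if (w.codeIsTrusted) {
    return "standard-container";
  }
  // Untrusted code that a local CLI cannot serve: use a hosted API.
  if (w.createdProgrammatically || w.manyConcurrentUsers || !w.hasLocalHypervisor) {
    return "sandbox-api";
  }
  // Untrusted code run interactively by a developer on their own machine.
  return "docker-sandbox";
}
```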

Morph Sandbox SDK

The Morph Sandbox SDK provides programmatic sandbox access for applications that need to run untrusted code at scale. It offers sub-300ms cold starts from pre-warmed pools, session-scoped filesystem persistence, and WebSocket streaming, and it is included with Morph API plans at no additional cost.

Create a sandbox and run code

import { MorphSandbox } from "@anthropic-ai/morph-sandbox";

const sandbox = await MorphSandbox.create({
  apiKey: process.env.MORPH_API_KEY,
  template: "python-3.12",
  timeout: 300, // 5 minute max lifetime
});

// Execute untrusted code safely
const result = await sandbox.exec(`python3 -c "
import json
data = {'status': 'ok', 'agent': 'tested'}
print(json.dumps(data))
"`);

console.log(result.stdout);   // {"status": "ok", "agent": "tested"}
console.log(result.exitCode); // 0

await sandbox.destroy();

Multi-step agent workflow with persistent filesystem

const sandbox = await MorphSandbox.create({
  apiKey: process.env.MORPH_API_KEY,
  template: "node-20",
  timeout: 600,
});

// Step 1: Agent writes code
await sandbox.filesystem.write("/app/index.ts", agentGeneratedCode);
await sandbox.filesystem.write("/app/index.test.ts", agentGeneratedTests);

// Step 2: Install dependencies (filesystem persists between calls)
await sandbox.exec("cd /app && npm install");

// Step 3: Run tests, iterate on failures
let result = await sandbox.exec("cd /app && npx vitest run");
let currentCode = agentGeneratedCode;
let retries = 0;

while (result.exitCode !== 0 && retries < 3) {
  // Repair the latest attempt, not the original, so the error output
  // always matches the code the model is asked to fix.
  currentCode = await llm.fixCode(currentCode, result.stderr);
  await sandbox.filesystem.write("/app/index.ts", currentCode);
  result = await sandbox.exec("cd /app && npx vitest run");
  retries++;
}

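// Step 4: Pull artifacts (assumes the project's Vitest config enables
// coverage with an lcov reporter; a plain `vitest run` writes no coverage)
const coverage = await sandbox.filesystem.read("/app/coverage/lcov.info");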

await sandbox.destroy();

Real-time output streaming

const sandbox = await MorphSandbox.create({
  apiKey: process.env.MORPH_API_KEY,
  template: "python-3.12",
});

await sandbox.filesystem.write("/app/build.py", buildScript);

const stream = sandbox.stream("cd /app && python build.py");

for await (const event of stream) {
  if (event.type === "stdout") {
    process.stdout.write(event.data);
  }
  if (event.type === "stderr" && event.data.includes("ERROR")) {
    stream.kill(); // Stop early on error
    break;
  }
}

await sandbox.destroy();
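
The snippets above call destroy() only on the success path, so if an earlier awaited call rejects, the sandbox stays alive until its timeout. A small wrapper with try/finally closes that gap. The sketch is written against a minimal `Destroyable` interface, not the SDK's own types, and `withSandbox` is a hypothetical helper.

```typescript
// Anything with an async destroy() method qualifies.
interface Destroyable {
  destroy(): Promise<void>;
}

// Creates a sandbox, runs `use`, and always destroys the sandbox
// afterwards, whether `use` resolves or rejects.
async function withSandbox<S extends Destroyable, T>(
  create: () => Promise<S>,
  use: (sandbox: S) => Promise<T>,
): Promise<T> {
  const sandbox = await create();
  try {
    return await use(sandbox);
  } finally {
    await sandbox.destroy();
  }
}
```

With the SDK calls shown earlier, usage would look like `withSandbox(() => MorphSandbox.create({ ... }), (sandbox) => sandbox.exec("..."))`, with the original error still propagating to the caller after cleanup.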

Docker Sandboxes + Morph: Not Either/Or

Docker Sandboxes and Morph Sandbox serve different layers of the stack. Use Docker Sandboxes when you are a developer running coding agents on your laptop. Use Morph Sandbox SDK when you are building an application that creates sandboxes programmatically for end users. A team might use Docker Sandboxes for local development and Morph Sandbox SDK for their production AI product.

Frequently Asked Questions

What is Docker Sandbox?

Docker Sandbox (officially "Docker Sandboxes") is a product from Docker that runs AI coding agents inside isolated microVMs. Each sandbox gets a dedicated Linux kernel, Docker daemon, filesystem, and network stack. It launched in March 2026 as an experimental feature and supports Claude Code, Gemini CLI, Codex, Copilot, Kiro, OpenCode, and Docker Agent out of the box.

Are Docker containers safe for running AI-generated code?

Standard Docker containers share the host kernel, which means kernel vulnerabilities can allow container escape. CVE-2022-0185, CVE-2019-5736, and CVE-2016-5195 all demonstrated practical container escape. For trusted code, containers provide reasonable isolation. For untrusted AI-generated code, you need stronger boundaries: Docker Sandboxes (microVM), gVisor (user-space kernel), or a managed sandbox API.

What is the difference between Docker containers and Docker Sandboxes?

Containers use namespaces and cgroups for process-level isolation but share the host kernel. Docker Sandboxes uses microVMs with dedicated kernels for VM-level isolation. Sandboxes also provide network isolation (no sandbox-to-sandbox traffic and no access to the host's localhost), a separate Docker Engine per sandbox (safe Docker-in-Docker), and credential proxy injection (API keys never enter the VM).

How do I install Docker Sandboxes?

Download the sbx CLI from the sbx-releases GitHub repository. Docker Desktop is not required. Requirements: macOS Apple silicon or Windows 11 x86_64 with Windows Hypervisor Platform enabled. Run sbx login to authenticate, then sbx run claude to launch your first sandboxed agent.

What agents does Docker Sandbox support?

Claude Code, Gemini CLI, GitHub Copilot CLI, OpenAI Codex, OpenCode, Kiro, Docker Agent, and a generic shell mode. The architecture supports adding new agents, and the open-source community has integrated it with autonomous systems like OpenClaw and NanoClaw.

Is Docker Sandbox free?

Docker Sandboxes is experimental and currently free. It does not require a Docker Desktop license. You still need API keys for the AI agents themselves (Anthropic, Google, OpenAI, etc.), which have their own pricing.

Can agents run Docker inside Docker Sandbox?

Yes. Each sandbox includes its own Docker Engine, so agents can run docker build, docker compose up, and other Docker commands inside the sandbox. The sandbox's Docker Engine is fully isolated from the host daemon. This is a key differentiator: Docker-in-sandbox without the security risks of Docker-in-Docker with --privileged.

What is the difference between Docker Sandbox and a sandbox API like Morph?

Docker Sandboxes is a local CLI for developers running agents interactively. Morph Sandbox SDK is a cloud API for applications creating sandboxes programmatically. Docker Sandboxes runs microVMs on your machine. Morph runs sandboxes in the cloud with sub-300ms cold starts, managed scaling, and HTTP/SDK access. Use Docker Sandboxes for local development. Use Morph Sandbox SDK for production AI products that need to run user code at scale.

Try Morph Sandbox SDK

Run untrusted code safely with sub-300ms cold starts, persistent filesystems, and WebSocket streaming. Included free with Morph API. Python and TypeScript SDKs.
