Why Docker Containers Are Not Sandboxes
Docker containers isolate processes using Linux namespaces and cgroups. They control what a process can see (its own PID namespace, mount namespace, network namespace) and how much it can consume (CPU shares, memory limits). This is sufficient for running trusted application code on shared infrastructure. It is not sufficient for running untrusted code from an AI agent.
The core problem: every container on a host shares the same Linux kernel. A vulnerability in the kernel is a vulnerability in every container on that machine. This is not a theoretical concern. CVE-2022-0185 allowed container escape through a heap overflow in the filesystem context code. CVE-2019-5736 allowed overwriting the host runc binary from inside a container. CVE-2016-5195 (Dirty COW), a race condition in the kernel's copy-on-write handling, let an unprivileged process write to read-only memory mappings, and public exploits used it to escalate from inside a container to root on the host.
Misconfiguration compounds the risk. Containers running with --privileged, with CAP_SYS_ADMIN, or with the Docker socket mounted (/var/run/docker.sock) give attackers a direct path to the host. AI agents that install packages, run build tools, and execute arbitrary code are exactly the workload where these risks matter most.
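The three misconfigurations above can be checked mechanically. The sketch below is one possible audit, assuming you have already parsed the JSON that docker inspect prints for a container. The interface covers only the fields the check reads (HostConfig.Privileged, HostConfig.CapAdd, and the top-level Mounts list), and the function name findRiskySettings is made up for this example.

```typescript
// Subset of the JSON printed by `docker inspect <container>`.
// Only the fields this check reads are declared.
interface InspectedContainer {
  Name: string;
  HostConfig: {
    Privileged: boolean;
    CapAdd: string[] | null;
  };
  Mounts?: { Source: string; Destination: string }[];
}

// Hypothetical helper: returns one finding per risky setting named in the text.
function findRiskySettings(c: InspectedContainer): string[] {
  const findings: string[] = [];

  if (c.HostConfig.Privileged) {
    findings.push("runs with --privileged");
  }

  // Capabilities may be written with or without the CAP_ prefix.
  const caps = (c.HostConfig.CapAdd ?? []).map((cap) =>
    cap.toUpperCase().replace(/^CAP_/, ""),
  );
  if (caps.includes("SYS_ADMIN") || caps.includes("ALL")) {
    findings.push("grants CAP_SYS_ADMIN");
  }

  const mounts = c.Mounts ?? [];
  if (mounts.some((m) => m.Source === "/var/run/docker.sock")) {
    findings.push("mounts the Docker socket");
  }

  return findings;
}
```

An empty result does not make a container a sandbox. It only rules out these three shortcuts to the host; the shared-kernel risk remains.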
Container Escape Is Not Exotic
Palo Alto's Unit 42 research team documented that standard containers "are not truly sandboxed as they share the host OS kernel." CyberArk demonstrated kernel exploitation paths from container to host root. These are not edge cases. Any workload running untrusted code in a standard container is one kernel CVE away from full host compromise.
Docker Sandboxes: The MicroVM Approach
Docker Sandboxes, released in March 2026, takes a different approach. Instead of running agents inside containers that share the host kernel, it runs each agent in a lightweight microVM with its own dedicated Linux kernel. The product ships as a standalone CLI called sbx and does not require Docker Desktop.
Each sandbox is an isolated environment: its own kernel, its own Docker daemon, its own filesystem, its own network stack. An agent running inside a sandbox can build Docker images, install system packages, modify files, and run any command it wants. None of this touches the host. If the agent does something destructive, the sandbox is destroyed and the host is unaffected.
Supported agents at launch are Claude Code, Gemini CLI, GitHub Copilot CLI, OpenAI Codex, OpenCode, Kiro, Docker Agent, and a generic shell mode. Docker Sandboxes also works with autonomous systems like OpenClaw and NanoClaw.
What Makes It Different from Docker-in-Docker
A common workaround for isolation is Docker-in-Docker (DinD): running a Docker daemon inside a container. The problem is that DinD typically requires privileged mode, which defeats the purpose of isolation. Docker Sandboxes solves this by running a full Docker Engine inside the microVM. The agent can use Docker normally, but the VM boundary prevents escape to the host. Agents can build and run Docker containers from inside the sandbox without ever getting access to the host daemon.
Four-Layer Isolation Model
Docker Sandboxes enforces isolation at four layers. Each layer addresses a different escape vector.
1. Hypervisor Isolation
Every sandbox runs inside a lightweight microVM with its own Linux kernel and shares no memory or processes with the host. A kernel exploit inside the sandbox compromises only that sandbox's kernel, not the host. This removes the class of container escapes that rely on exploiting a shared kernel; an attacker would need a separate hypervisor escape to reach the host.
2. Network Isolation
Each sandbox has its own network stack. Sandboxes cannot communicate with each other and cannot reach the host's localhost. All HTTP/HTTPS traffic is proxied through the host and filtered by a network policy. Non-HTTP protocols are blocked entirely. You choose from three network policies: open (all traffic allowed), balanced (the default: deny by default, with common dev sites allowed), or locked down (all traffic blocked unless explicitly allowed).
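The policy decision can be pictured as a small allowlist check. The sketch below is an illustration only, not how sbx implements its proxy. The type names, the wildcard syntax, and the example hosts are all assumptions. It also assumes that non-HTTP traffic is rejected under every policy, as the paragraph above states.

```typescript
// Hypothetical model of the three policies described above.
type PolicyName = "open" | "balanced" | "locked-down";

interface NetworkPolicy {
  name: PolicyName;
  // Exact hostnames, or "*.example.com" to match any subdomain (assumed syntax).
  allowedHosts: string[];
}

interface OutboundConnection {
  scheme: string; // "http", "https", "ssh", ...
  host: string;
}

function hostMatches(pattern: string, host: string): boolean {
  if (pattern.startsWith("*.")) {
    // "*.npmjs.org" matches "registry.npmjs.org" but not "npmjs.org" itself.
    return host.endsWith(pattern.slice(1));
  }
  return host === pattern;
}

function isAllowed(policy: NetworkPolicy, conn: OutboundConnection): boolean {
  const scheme = conn.scheme.toLowerCase();
  // Non-HTTP protocols are blocked regardless of policy.
  if (scheme !== "http" && scheme !== "https") return false;
  if (policy.name === "open") return true;
  // "balanced" and "locked-down" both deny unless a host is listed;
  // they differ only in how much is pre-populated in allowedHosts.
  const host = conn.host.toLowerCase();
  return policy.allowedHosts.some((p) => hostMatches(p.toLowerCase(), host));
}
```

In this model the balanced and locked-down policies share one code path. The only difference is the starting allowlist, so a missing entry always fails closed.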
3. Docker Engine Isolation
Each sandbox runs its own Docker Engine, completely separate from the host's Docker daemon. When the agent runs docker build or docker compose up, those commands execute against the sandbox's engine. The agent has no path to the host Docker daemon. This is the key differentiator: Docker-in-sandbox without the security risks of Docker-in-Docker.
4. Credential Isolation
API keys are never stored inside the sandbox. A host-side proxy intercepts outbound API requests and injects authentication headers before forwarding. The agent can call APIs that require credentials, but the credential values never enter the VM. If the sandbox is compromised, the attacker cannot extract API keys.
Credential Proxy vs Environment Variables
Most Docker setups pass API keys as environment variables, which means any process inside the container can read them. Docker Sandboxes' credential proxy is a meaningful improvement: the proxy on the host side injects auth headers into outbound requests, so the sandbox process never has access to the raw credential. Use sbx secret set for credentials that support proxy-based injection. Fall back to environment variables only when proxy injection is not available for a specific service.
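To make the difference concrete, here is a minimal sketch of proxy-side header injection. It is not the Docker Sandboxes implementation: the rule shape, the function name injectCredentials, the host api.example.com, and the secret name EXAMPLE_API_KEY are all hypothetical. The point is that the secret lives in a store on the host, and the request the sandbox sends never contains it.

```typescript
// A request as it leaves the sandbox: no credential present.
interface OutboundRequest {
  host: string;
  path: string;
  headers: Record<string, string>;
}

// Hypothetical rule: which host-side secret goes into which header for which API host.
interface InjectionRule {
  host: string;
  header: string;
  secretName: string;
  prefix: string; // e.g. "Bearer ", or "" for a raw key
}

// Runs on the host. Returns a new request with the auth header added when a
// rule matches; the sandbox-side request object is left untouched.
function injectCredentials(
  req: OutboundRequest,
  rules: InjectionRule[],
  hostSecrets: Record<string, string | undefined>,
): OutboundRequest {
  const rule = rules.find((r) => r.host === req.host);
  if (rule === undefined) return req;

  const secret = hostSecrets[rule.secretName];
  if (secret === undefined) return req;

  return {
    ...req,
    headers: { ...req.headers, [rule.header]: rule.prefix + secret },
  };
}
```

With environment variables, any process in the sandbox could read the key directly. In this model the most a compromised sandbox can do is make requests through the proxy while it is running; it cannot copy the key for later use.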
Getting Started with sbx CLI
The sbx CLI is the primary interface for Docker Sandboxes. It does not require Docker Desktop. Download the binary from the sbx-releases repository or install through your package manager.
System Requirements
macOS with Apple silicon, or Windows 11 (x86_64) with Windows Hypervisor Platform enabled. Linux support is on the roadmap. The CLI handles microVM provisioning locally, so no cloud account or remote infrastructure is needed.
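A setup script can check the platform half of these requirements before attempting an install. The sketch below uses the values Node.js reports in process.platform and process.arch. The function name is made up, and the check cannot confirm the Windows version or whether Windows Hypervisor Platform is enabled.

```typescript
// Returns true for the two platform/architecture pairs listed above.
// Pass process.platform and process.arch when running under Node.js.
function isSupportedPlatform(platform: string, arch: string): boolean {
  const macAppleSilicon = platform === "darwin" && arch === "arm64";
  const windowsX64 = platform === "win32" && arch === "x64";
  return macAppleSilicon || windowsX64;
}
```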
Install and authenticate
# Download sbx (macOS Apple silicon)
# Binary available at github.com/docker/sbx-releases
# Authenticate with Docker
sbx login
# Opens browser for Docker OAuth
# First login prompts for default network policy:
# Open - all network traffic allowed
# Balanced - default deny, common dev sites allowed
# Locked Down - all traffic blocked unless explicitly allowed
Run an agent in a sandbox
# Launch Claude Code in a sandbox
sbx run claude
# Launch Gemini CLI
sbx run gemini
# Launch a generic shell
sbx run shell
# The first run pulls the agent image (takes longer).
# Subsequent runs reuse the cached image and start in seconds.
Manage sandboxes and credentials
# Create a sandbox without attaching
sbx create --agent claude --name my-project
# Open a shell in an existing sandbox
sbx exec my-project
# Set a secret (injected via credential proxy, not env var)
sbx secret set ANTHROPIC_API_KEY
# View active security policies
sbx policy ls
# List running sandboxes
sbx ls
# Stop and remove a sandbox
sbx rm my-project
Workspace Mounting
Your project directory is mounted into the sandbox through filesystem passthrough. Changes in either direction are instant, with no sync process or file copying. The agent sees your actual host files, so it can edit code, run tests, and commit changes. The host filesystem outside the mounted directory is not accessible from inside the sandbox.
Traditional Docker Isolation vs Docker Sandboxes
The security models are fundamentally different. Standard containers provide process-level isolation with a shared kernel. Docker Sandboxes provides VM-level isolation with dedicated kernels. Here is how they compare across the dimensions that matter for AI agent workloads.
| Property | Standard Docker Container | Docker Sandbox (microVM) |
|---|---|---|
| Kernel | Shared with host and other containers | Dedicated per sandbox |
| Container escape risk | Kernel CVEs affect all containers on host | Kernel exploit isolated to single sandbox |
| Docker-in-Docker | Typically requires --privileged (defeats isolation) | Built-in isolated Docker Engine |
| Network isolation | Configurable namespaces, but traffic is handled by the shared host kernel | Fully isolated, deny-by-default, HTTP proxy |
| Credential handling | Environment variables (readable by any process) | Host-side proxy injection (never enters VM) |
| Startup time | Sub-second | Seconds (first run), sub-second (cached) |
| Resource overhead | Minimal (shared kernel) | Higher (dedicated kernel per VM) |
| Use case | Trusted application workloads | Untrusted AI agent code execution |
When Standard Containers Are Enough
If you control the code running inside the container, standard Docker provides adequate isolation for most workloads. Web servers, databases, CI pipelines running your own test suite: these are trusted workloads where kernel-level escape is not the primary threat model. The attack surface is manageable with good security practices: no privileged mode, minimal capabilities, no Docker socket mounting, regular kernel updates.
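The practices listed above map onto a handful of docker run flags. The sketch below builds such an argument list; the helper name and option shape are made up for this example. The flags themselves (--cap-drop, --cap-add, --security-opt no-new-privileges) are standard Docker CLI options. The no-new-privileges option goes slightly beyond the list above and is included as a common companion to dropping capabilities.

```typescript
interface HardenedRunOptions {
  image: string;
  command: string[];
  // Capabilities to add back after dropping all; keep this list minimal.
  capAdd?: string[];
}

// Hypothetical helper: builds arguments for `docker run` that avoid
// --privileged, drop all capabilities, and never mount the Docker socket.
function hardenedRunArgs(opts: HardenedRunOptions): string[] {
  const capAdd = (opts.capAdd ?? []).map((cap) => `--cap-add=${cap}`);
  return [
    "run",
    "--rm",
    "--cap-drop=ALL",
    ...capAdd,
    "--security-opt=no-new-privileges",
    opts.image,
    ...opts.command,
  ];
}

// Example: a test run that needs no extra capabilities.
const args = hardenedRunArgs({ image: "node:20", command: ["npm", "test"] });
console.log(["docker", ...args].join(" "));
```

These flags narrow the attack surface for trusted workloads. They do not change the shared-kernel model, which is why regular kernel updates stay on the list.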
When You Need MicroVM Isolation
AI coding agents generate and execute arbitrary code. The code is not reviewed before execution. The agent may install unknown packages, run build scripts from untrusted repositories, or execute commands that interact with the filesystem and network in unexpected ways. This is the definition of untrusted code. MicroVM isolation ensures that even a worst-case scenario (kernel exploit, privilege escalation) is contained within a single sandbox with no path to the host or other sandboxes.
When You Need a Sandbox API Instead
Docker Sandboxes is a local CLI tool. It runs microVMs on your machine, supports interactive agent sessions, and mounts your workspace. This is the right model for a developer running Claude Code on their laptop.
It is not the right model for three common scenarios:
Building AI Products
If your application needs to create sandboxes for users programmatically, you need a cloud-hosted sandbox API with HTTP/SDK access, not a local CLI. Your backend calls the API to create a sandbox, send code, and retrieve results. Docker Sandboxes does not expose a programmatic API for this use case.
Scaling to Many Users
A sandbox API handles concurrency, resource allocation, and lifecycle management across thousands of simultaneous sandboxes. Running microVMs on a single developer machine does not scale to production workloads with many concurrent users.
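A rough memory budget shows where a single machine's ceiling comes from. The sketch below is back-of-the-envelope arithmetic only. The function name is made up, and the figures in the usage example are illustrative inputs, not measured values for Docker Sandboxes.

```typescript
// Estimates how many microVMs fit in memory on one machine.
// All inputs are in MiB and must be supplied by the caller.
function maxConcurrentSandboxes(
  hostMemoryMiB: number,
  reservedForHostMiB: number,
  perSandboxMiB: number,
): number {
  if (perSandboxMiB <= 0) {
    throw new RangeError("perSandboxMiB must be positive");
  }
  const available = hostMemoryMiB - reservedForHostMiB;
  return Math.max(0, Math.floor(available / perSandboxMiB));
}

// Illustrative inputs: a 16 GiB laptop, 4 GiB kept for the host OS,
// and 2 GiB assumed per sandbox.
console.log(maxConcurrentSandboxes(16384, 4096, 2048)); // 6
```

Even with generous assumptions the result stays in single or low double digits, and CPU and disk impose further limits. That ceiling is what a managed API removes.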
No Local Infrastructure
Sandbox APIs run in the cloud. No local Docker, no hypervisor requirements, no binary downloads. Your CI system, your serverless functions, and your backend services can all create sandboxes with a single API call regardless of what operating system they run on.
| Dimension | Docker Sandboxes (sbx) | Sandbox API (Morph) |
|---|---|---|
| Access model | Local CLI on developer machine | Cloud API (HTTP/SDK) |
| Use case | Developer running agents interactively | Application creating sandboxes programmatically |
| Concurrency | Limited by local machine resources | Managed cloud scaling |
| Cold start | Seconds (first), sub-second (cached) | < 300ms (pre-warmed) |
| Infrastructure required | Local hypervisor (macOS/Windows) | None (cloud-hosted) |
| Pricing | Free (experimental) | Included with Morph API plans |
| Docker Desktop required | No | No |
Morph Sandbox SDK
The Morph Sandbox SDK provides programmatic sandbox access for applications that need to run untrusted code at scale. Sub-300ms cold starts from pre-warmed pools, session-scoped filesystem persistence, and WebSocket streaming. Included with Morph API plans at no additional cost.
Create a sandbox and run code
import { MorphSandbox } from "@anthropic-ai/morph-sandbox";
const sandbox = await MorphSandbox.create({
  apiKey: process.env.MORPH_API_KEY,
  template: "python-3.12",
  timeout: 300, // 5 minute max lifetime
});
// Execute untrusted code safely
const result = await sandbox.exec(`python3 -c "
import json
data = {'status': 'ok', 'agent': 'tested'}
print(json.dumps(data))
"`);
console.log(result.stdout); // {"status": "ok", "agent": "tested"}
console.log(result.exitCode); // 0
await sandbox.destroy();
Multi-step agent workflow with persistent filesystem
// agentGeneratedCode, agentGeneratedTests, and llm are supplied by your application
const sandbox = await MorphSandbox.create({
  apiKey: process.env.MORPH_API_KEY,
  template: "node-20",
  timeout: 600,
});
// Step 1: Agent writes code
await sandbox.filesystem.write("/app/index.ts", agentGeneratedCode);
await sandbox.filesystem.write("/app/index.test.ts", agentGeneratedTests);
// Step 2: Install dependencies (filesystem persists between calls)
await sandbox.exec("cd /app && npm install");
// Step 3: Run tests, iterate on failures
let result = await sandbox.exec("cd /app && npx vitest run");
// Track the latest version so each fix builds on the previous attempt
let currentCode = agentGeneratedCode;
let retries = 0;
while (result.exitCode !== 0 && retries < 3) {
  currentCode = await llm.fixCode(currentCode, result.stderr);
  await sandbox.filesystem.write("/app/index.ts", currentCode);
  result = await sandbox.exec("cd /app && npx vitest run");
  retries++;
}
// Step 4: Pull artifacts
const coverage = await sandbox.filesystem.read("/app/coverage/lcov.info");
await sandbox.destroy();
Real-time output streaming
const sandbox = await MorphSandbox.create({
  apiKey: process.env.MORPH_API_KEY,
  template: "python-3.12",
});
await sandbox.filesystem.write("/app/build.py", buildScript);
const stream = sandbox.stream("cd /app && python build.py");
for await (const event of stream) {
  if (event.type === "stdout") {
    process.stdout.write(event.data);
  }
  if (event.type === "stderr" && event.data.includes("ERROR")) {
    stream.kill(); // Stop early on error
    break;
  }
}
await sandbox.destroy();
Docker Sandboxes + Morph: Not Either/Or
Docker Sandboxes and Morph Sandbox serve different layers of the stack. Use Docker Sandboxes when you are a developer running coding agents on your laptop. Use Morph Sandbox SDK when you are building an application that creates sandboxes programmatically for end users. A team might use Docker Sandboxes for local development and Morph Sandbox SDK for their production AI product.
Frequently Asked Questions
What is Docker Sandbox?
Docker Sandbox (officially "Docker Sandboxes") is a product from Docker that runs AI coding agents inside isolated microVMs. Each sandbox gets a dedicated Linux kernel, Docker daemon, filesystem, and network stack. It launched in March 2026 as an experimental feature and supports Claude Code, Gemini CLI, Codex, Copilot, Kiro, OpenCode, and Docker Agent out of the box.
Are Docker containers safe for running AI-generated code?
Standard Docker containers share the host kernel, which means kernel vulnerabilities can allow container escape. CVE-2022-0185, CVE-2019-5736, and CVE-2016-5195 all demonstrated practical container escape. For trusted code, containers provide reasonable isolation. For untrusted AI-generated code, you need stronger boundaries: Docker Sandboxes (microVM), gVisor (user-space kernel), or a managed sandbox API.
What is the difference between Docker containers and Docker Sandboxes?
Containers use namespaces and cgroups for process-level isolation but share the host kernel. Docker Sandboxes uses microVMs with dedicated kernels for VM-level isolation. Sandboxes also provide network isolation (no sandbox-to-sandbox or sandbox-to-host communication), a separate Docker Engine per sandbox (safe Docker-in-Docker), and credential proxy injection (API keys never enter the VM).
How do I install Docker Sandboxes?
Download the sbx CLI from the sbx-releases GitHub repository. Docker Desktop is not required. Requirements: macOS Apple silicon or Windows 11 x86_64 with Windows Hypervisor Platform enabled. Run sbx login to authenticate, then sbx run claude to launch your first sandboxed agent.
What agents does Docker Sandbox support?
Claude Code, Gemini CLI, GitHub Copilot CLI, OpenAI Codex, OpenCode, Kiro, Docker Agent, and a generic shell mode. The architecture supports adding new agents, and the open-source community has integrated it with autonomous systems like OpenClaw and NanoClaw.
Is Docker Sandbox free?
Docker Sandboxes is experimental and currently free. It does not require a Docker Desktop license. You still need API keys for the AI agents themselves (Anthropic, Google, OpenAI, etc.), which have their own pricing.
Can agents run Docker inside Docker Sandbox?
Yes. Each sandbox includes its own Docker Engine, so agents can run docker build, docker compose up, and other Docker commands inside the sandbox. The sandbox's Docker Engine is fully isolated from the host daemon. This is a key differentiator: Docker-in-sandbox without the security risks of Docker-in-Docker with --privileged.
What is the difference between Docker Sandbox and a sandbox API like Morph?
Docker Sandboxes is a local CLI for developers running agents interactively. Morph Sandbox SDK is a cloud API for applications creating sandboxes programmatically. Docker Sandboxes runs microVMs on your machine. Morph runs sandboxes in the cloud with sub-300ms cold starts, managed scaling, and HTTP/SDK access. Use Docker Sandboxes for local development. Use Morph Sandbox SDK for production AI products that need to run user code at scale.
Try Morph Sandbox SDK
Run untrusted code safely with sub-300ms cold starts, persistent filesystems, and WebSocket streaming. Included free with Morph API. Python and TypeScript SDKs.