Drive Agents Programmatically

Everything the Caged CLI and dashboard do goes through the public API — so you can script the entire loop: create a sandbox with an agent pre-installed, send it prompts, inspect the output, and tear it down. This is the foundation for building agent pipelines, batch jobs, and custom tooling.

Prerequisites

A Caged API key (caged_sk_...) — create one in the dashboard or with caged keys create
An ANTHROPIC_API_KEY for Claude Code (the agent talks to Anthropic with your key — Caged never proxies or resells tokens)

1. Create a Sandbox with an Agent

Create a Python sandbox that clones your repo, installs Claude Code, and caps spend at $5:

curl -X POST https://api.caged.dev/v1/sandboxes \
  -H "Authorization: Bearer $CAGED_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "template": "python-312",
    "memory_mb": 1024,
    "repo": "https://github.com/your-org/your-project",
    "agents": ["claude"],
    "budget": 5.00,
    "env": {"ANTHROPIC_API_KEY": "sk-ant-..."}
  }'

The create call returns once the repo is cloned and the agent is installed (allow up to a few minutes).

Sandboxes that install agents are automatically provisioned with at least 1024 MB of memory — agent installers need the headroom.
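The same request can be assembled in Python with only the standard library. This is a minimal sketch: the endpoint, headers, and body fields mirror the curl call above, while the helper name build_create_request is hypothetical. This section does not show the shape of the create response, so the sketch stops at building the request rather than guessing at response fields.

```python
import json
import urllib.request

API_BASE = "https://api.caged.dev/v1"  # base URL taken from the curl example


def build_create_request(caged_key: str, anthropic_key: str, repo: str) -> urllib.request.Request:
    """Build (but do not send) the POST /sandboxes request shown above.

    Hypothetical helper: only the URL, headers, and body fields come from the guide.
    """
    body = {
        "template": "python-312",
        "memory_mb": 1024,
        "repo": repo,
        "agents": ["claude"],
        "budget": 5.00,
        "env": {"ANTHROPIC_API_KEY": anthropic_key},
    }
    return urllib.request.Request(
        f"{API_BASE}/sandboxes",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {caged_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the request to urllib.request.urlopen with a generous timeout, since the create call blocks while the repo is cloned and the agent is installed. Reading both keys from environment variables keeps them out of the script itself.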

2. Prompt the Agent

Use the exec endpoint to run claude -p (print mode) with any prompt, substituting the ID of the sandbox you just created for $SANDBOX_ID. The command runs inside the VM with the repo at /workspace:

curl -X POST https://api.caged.dev/v1/sandboxes/$SANDBOX_ID/exec \
  -H "Authorization: Bearer $CAGED_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"command": "cd /workspace && claude -p \"Summarize what this repo does in 3 bullets\""}'

Response

{
  "output": "- A Python library for parsing blockchain transactions...\n- ...\n- ...\n",
  "exit_code": 0
}
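Hand-escaping a prompt inside a shell command inside a JSON string is easy to get wrong. The sketch below builds the exec body with shlex.quote and json.dumps, then reads the two response fields shown above. Both helper names are hypothetical, and the parser assumes the response carries only the output and exit_code fields from the example.

```python
import json
import shlex


def build_exec_body(prompt: str, workdir: str = "/workspace") -> str:
    """Return the JSON body for the exec endpoint, shell-quoting the prompt."""
    command = f"cd {shlex.quote(workdir)} && claude -p {shlex.quote(prompt)}"
    return json.dumps({"command": command})


def parse_exec_response(raw: str) -> str:
    """Return the agent's output, or raise if the command exited non-zero."""
    data = json.loads(raw)
    if data["exit_code"] != 0:
        raise RuntimeError(
            f"command failed (exit {data['exit_code']}): {data['output']}"
        )
    return data["output"]
```

With this in place, prompts containing quotes, dollar signs, or backticks reach claude unchanged instead of being interpreted by the shell inside the VM.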

3. Let the Agent Make Changes

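The Python snippets from here on use the caged client and sandbox object set up in the Full Script section. Claude Code can edit files, run tests, and commit; give it permission with --dangerously-skip-permissions (safe here: the blast radius is the disposable VM, which is exactly what Caged is for):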

result = caged.sandboxes.exec(
    sandbox.id,
    'cd /workspace && claude -p --dangerously-skip-permissions '
    '"Fix the failing tests and run pytest to verify"',
    timeout=600,  # agent work can take a while
)
print(result.output)

Continue the same conversation across calls with -c (short for --continue), which resumes the most recent conversation in the current directory:

followup = caged.sandboxes.exec(
    sandbox.id,
    'cd /workspace && claude -c -p "Now add a changelog entry for the fix"',
    timeout=600,  # follow-up turns are agent work too
)
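When a job needs several turns, the first-prompt and follow-up pattern above can be wrapped in a loop. This is a sketch under assumptions: run_conversation is a hypothetical helper, and exec_fn stands in for any callable that runs one command and returns a result with an exit_code attribute, as the SDK results in this guide appear to have.

```python
import shlex
from typing import Callable, Iterable, List


def run_conversation(exec_fn: Callable[[str], object], prompts: Iterable[str]) -> List[object]:
    """Send prompts in order; every prompt after the first adds -c to continue."""
    results = []
    for i, prompt in enumerate(prompts):
        flags = "-p" if i == 0 else "-c -p"
        result = exec_fn(f"cd /workspace && claude {flags} {shlex.quote(prompt)}")
        results.append(result)
        if getattr(result, "exit_code", 0) != 0:
            break  # do not build on a turn that failed
    return results
```

A call might look like run_conversation(lambda cmd: caged.sandboxes.exec(sandbox.id, cmd, timeout=600), ["Fix the failing tests", "Now add a changelog entry for the fix"]). Every turn runs from /workspace, so -c picks up the conversation started by the first prompt.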

4. Inspect the Results

Read what the agent actually changed — without trusting its summary:

# Diff of all changes in the workspace
diff = caged.files.git_diff(sandbox.id)
print(diff)

# Read a specific file
content = caged.files.read(sandbox.id, "/workspace/CHANGELOG.md")

# Verify tests pass yourself
check = caged.sandboxes.exec(sandbox.id, "cd /workspace && python -m pytest -q")
print("tests pass" if check.ok else f"failed: {check.output}")
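A diff is easier to gate on once it is reduced to a list of touched paths. The sketch below assumes caged.files.git_diff returns standard git diff text, which this guide does not state explicitly; changed_files is a hypothetical helper and its simple pattern does not handle paths that themselves contain " b/".

```python
import re
from typing import List

# Matches the per-file header line that git emits, e.g.
# "diff --git a/src/app.py b/src/app.py"
_DIFF_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def changed_files(diff_text: str) -> List[str]:
    """Return the post-change paths named in `diff --git` header lines."""
    paths: List[str] = []
    for line in diff_text.splitlines():
        match = _DIFF_HEADER.match(line)
        if match and match.group(2) not in paths:
            paths.append(match.group(2))
    return paths
```

One use is an allowlist check before accepting the agent's work: collect any path from changed_files(diff) that falls outside the directories the prompt was supposed to touch, and fail the run if that list is non-empty.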

5. Handle Failures Properly

Exec distinguishes "the command failed" from "the platform failed":

result = caged.sandboxes.exec(sandbox.id, "cd /workspace && python -m pytest -q")

if result.error:
    # Infrastructure problem: sandbox died, network issue, etc.
    raise RuntimeError(f"sandbox failure: {result.error}")
elif result.exit_code != 0:
    # Command ran and failed — stderr is in output
    print(f"tests failed (exit {result.exit_code}):\n{result.output}")
else:
    print(result.output)
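Because the two failure kinds are separate, a caller can retry only the platform side. The following is a sketch, not SDK behavior: exec_with_retry is a hypothetical helper, it assumes some result.error conditions are transient, and it is only appropriate for commands that are safe to run twice (a test run, say, rather than an agent prompt that edits files).

```python
import time
from typing import Callable


def exec_with_retry(
    run: Callable[[], object],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
):
    """Call run() until the result has no platform error, up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(attempts):
        result = run()
        if not getattr(result, "error", None):
            return result  # the command ran; its exit code is the caller's concern
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff between tries
    return result  # still failing: hand the last platform error back
```

Usage would be along the lines of exec_with_retry(lambda: caged.sandboxes.exec(sandbox.id, "cd /workspace && python -m pytest -q")), followed by the same error and exit_code checks as above. A non-zero exit code is never retried, since rerunning failing tests rarely changes the answer.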

6. Clean Up

# Snapshot first if you want to keep the agent's work
snapshot = caged.snapshots.create(sandbox.id, name="agent-run-1")

caged.sandboxes.destroy(sandbox.id)
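To make cleanup hard to forget, the create, snapshot, and destroy calls can be wrapped in a context manager. This is a minimal sketch: managed_sandbox is a hypothetical helper, and it relies only on the three SDK calls used in this guide (sandboxes.create, snapshots.create, sandboxes.destroy) with the argument shapes shown here.

```python
from contextlib import contextmanager


@contextmanager
def managed_sandbox(client, snapshot_name=None, **create_kwargs):
    """Create a sandbox, yield it, and always destroy it afterwards."""
    sandbox = client.sandboxes.create(**create_kwargs)
    try:
        yield sandbox
        if snapshot_name is not None:
            # Reached only when the with-body finished without raising.
            client.snapshots.create(sandbox.id, name=snapshot_name)
    finally:
        client.sandboxes.destroy(sandbox.id)
```

Inside a with managed_sandbox(caged, snapshot_name="agent-run-1", template="python-312", agents=["claude"]) as sandbox: block, the sandbox is destroyed whether the body succeeds or raises, and the snapshot is taken only on success so that failed runs are not preserved by accident.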

Full Script

A complete, runnable batch job — point an agent at a repo, collect its answer, destroy the sandbox:

import os
from caged import Caged

caged = Caged(api_key=os.environ["CAGED_API_KEY"])

sandbox = caged.sandboxes.create(
    template="python-312",
    memory_mb=1024,
    repo="https://github.com/your-org/your-project",
    agents=["claude"],
    budget=5.00,
    env={"ANTHROPIC_API_KEY": os.environ["ANTHROPIC_API_KEY"]},
)

try:
    result = caged.sandboxes.exec(
        sandbox.id,
        'cd /workspace && claude -p "Review this codebase for security issues. '
        'List the top 3 with file and line references."',
        timeout=600,
    )
    if result.ok:
        print(result.output)
    else:
        print(f"agent failed (exit {result.exit_code}): {result.output or result.error}")
finally:
    caged.sandboxes.destroy(sandbox.id)
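For a batch over several repositories, the script above can become the body of a per-repo function that is fanned out across threads. The sketch below is one way to do that, with assumptions stated plainly: run_batch and its review_repo argument are hypothetical, and whether the SDK client is safe to share across threads, or how many sandboxes an account may run at once, is not covered in this guide.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def run_batch(
    review_repo: Callable[[str], str],
    repos: Iterable[str],
    max_workers: int = 4,
) -> Dict[str, str]:
    """Run review_repo(repo) concurrently; map each repo to its output or error text."""
    repo_list = list(repos)
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {repo: pool.submit(review_repo, repo) for repo in repo_list}
        for repo, future in futures.items():
            try:
                results[repo] = future.result()
            except Exception as exc:  # one failed repo should not sink the batch
                results[repo] = f"error: {exc}"
    return results
```

Here review_repo would create a sandbox for one repo URL, run the prompt, and destroy the sandbox in a finally block, exactly as the full script does. Keeping the per-sandbox budget in place means a runaway agent in one repo cannot consume the spend intended for the others.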

Interactive Sessions

exec is one-shot and non-interactive. For a live TTY (e.g. the full Claude Code TUI), connect to the terminal WebSocket the dashboard IDE uses:

wss://api.caged.dev/v1/sandboxes/{id}/terminal?rows=40&cols=120&token=caged_sk_...

Send raw keystrokes, receive raw terminal output — any WebSocket client works. Sessions stay open while idle and are recorded for replay.
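The terminal URL can be built rather than concatenated by hand, which keeps the query string correctly encoded. This sketch covers only URL construction; terminal_url is a hypothetical helper, and the path and the rows, cols, and token parameters are taken from the example above.

```python
from urllib.parse import quote, urlencode


def terminal_url(sandbox_id: str, token: str, rows: int = 40, cols: int = 120) -> str:
    """Return the terminal WebSocket URL for a sandbox."""
    query = urlencode({"rows": rows, "cols": cols, "token": token})
    return (
        "wss://api.caged.dev/v1/sandboxes/"
        f"{quote(sandbox_id, safe='')}/terminal?{query}"
    )
```

Because the API key travels in the query string, treat the resulting URL as a secret: avoid printing it or writing it to logs.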
