Merged
7 changes: 7 additions & 0 deletions .changeset/lovely-friends-study.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
'@cloudflare/sandbox': patch
---

Add OpenAI Agents adapters

Add OpenAI Agents adapters (`Shell` and `Editor`) that integrate Cloudflare Sandbox with the OpenAI Agents SDK. These adapters enable AI agents to execute shell commands and perform file operations (create, update, delete) inside sandboxed environments. Both adapters automatically collect and timestamp results from operations, making it easy to track command execution and file modifications during agent sessions. The adapters are exported from `@cloudflare/sandbox/openai` and implement the OpenAI Agents `Shell` and `Editor` interfaces.
323 changes: 323 additions & 0 deletions docs/OPENAI_AGENTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,323 @@
# OpenAI Agents Adapter

The Cloudflare Sandbox SDK provides adapters for the [OpenAI Agents SDK](https://github.com/openai/openai-agents-js/) that let AI agents execute shell commands and perform file operations inside sandboxed environments.

## Overview

The OpenAI Agents adapter consists of two main components:

- **`Shell`**: Implements the OpenAI Agents `Shell` interface, allowing agents to execute shell commands in the sandbox
- **`Editor`**: Implements the OpenAI Agents `Editor` interface, enabling agents to create, update, and delete files using patch operations

Both adapters automatically collect results from operations, making it easy to track what commands were executed and what files were modified during an agent session.

Review comment (Contributor):

Critical: Security warning needs to be prominent at the top.

Add this immediately after the title:

> **⚠️ SECURITY WARNING**
> 
> These adapters enable AI agents to execute arbitrary shell commands and modify files. When using `needsApproval: false`, the AI has **unrestricted access** within the container.
> 
> **Production deployments MUST:**
> - Implement approval workflows for sensitive operations
> - Set up rate limiting to prevent abuse
> - Monitor and log all agent operations
> - Validate AI responses before executing operations

Developers scanning quickly will miss the security implications if it's only in the example README.
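
The "approval workflows for sensitive operations" item can be pictured as a small policy function that decides whether a command should wait for human review. The sketch below is illustrative only: `commandNeedsApproval` and `SENSITIVE_PATTERNS` are hypothetical names, not part of `@cloudflare/sandbox` or `@openai/agents`, and how such a check is wired into `needsApproval` depends on the SDK version in use. A pattern list like this is easy to bypass, so it complements the container boundary rather than replacing it.

```typescript
// Hypothetical policy helper; not part of @cloudflare/sandbox or @openai/agents.
// The patterns are examples only and do not form a complete deny-list.
const SENSITIVE_PATTERNS: RegExp[] = [
  /\brm\s+-[a-z]*r/i, // recursive deletes such as `rm -rf`
  /\bcurl\b|\bwget\b/, // outbound network fetches
  /\bgit\s+push\b/, // publishing changes
  /\bnpm\s+publish\b/ // publishing packages
];

// Returns true when a command matches any sensitive pattern.
function commandNeedsApproval(command: string): boolean {
  return SENSITIVE_PATTERNS.some((pattern) => pattern.test(command));
}

console.log(commandNeedsApproval('ls -la')); // false
console.log(commandNeedsApproval('rm -rf /workspace')); // true
```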


## Installation

The adapters are part of the `@cloudflare/sandbox` package:

```typescript
import { getSandbox } from '@cloudflare/sandbox';
import { Shell, Editor } from '@cloudflare/sandbox/openai';
import { Agent, applyPatchTool, run, shellTool } from '@openai/agents';
```

## Basic Usage

### Setting Up an Agent

```typescript
import { getSandbox } from '@cloudflare/sandbox';
import { Shell, Editor } from '@cloudflare/sandbox/openai';
import { Agent, applyPatchTool, run, shellTool } from '@openai/agents';

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Get a sandbox instance
    const sandbox = getSandbox(env.Sandbox, 'workspace-session');

    // Create shell adapter (executes commands in /workspace by default)
    const shell = new Shell(sandbox);

    // Create editor adapter (operates on /workspace by default)
    const editor = new Editor(sandbox, '/workspace');

    // Create an agent with both tools
    const agent = new Agent({
      name: 'Sandbox Assistant',
      model: 'gpt-4',
      instructions:
        'You can execute shell commands and edit files in the workspace.',
      tools: [
        shellTool({ shell, needsApproval: false }),
        applyPatchTool({ editor, needsApproval: false })
      ]
    });

    // Run the agent with user input
    const { input } = await request.json();
    const result = await run(agent, input);

    // Access collected results
    const commandResults = shell.results;
    const fileOperations = editor.results;

    return new Response(
      JSON.stringify({
        naturalResponse: result.finalOutput,
        commandResults,
        fileOperations
      }),
      {
        headers: { 'Content-Type': 'application/json' }
      }
    );
  }
};
```

## Shell Adapter

The `Shell` class adapts Cloudflare Sandbox `exec` calls to the OpenAI Agents `Shell` contract.

### Features

- Executes commands sequentially in the sandbox
- Preserves working directory (`/workspace` by default)
- Handles timeouts and errors gracefully
- Collects results with timestamps for each command
- Separates stdout and stderr output

### Command Results

Each executed command is automatically collected in `shell.results`:

```typescript
interface CommandResult {
  command: string; // The command that was executed
  stdout: string; // Standard output
  stderr: string; // Standard error
  exitCode: number | null; // Exit code (null for timeouts)
  timestamp: number; // Unix timestamp in milliseconds
}
```

### Example: Inspecting Workspace

```typescript
const shell = new Shell(sandbox);

// Agent can execute commands like:
// - ls -la
// - cat package.json
// - git status
// - npm install

// After agent execution, access results:
shell.results.forEach((result) => {
  console.log(`Command: ${result.command}`);
  console.log(`Exit code: ${result.exitCode}`);
  console.log(`Output: ${result.stdout}`);
});
```

### Error Handling

The Shell adapter handles various error scenarios:

- **Command failures**: Non-zero exit codes are captured in `exitCode`
- **Timeouts**: Commands that exceed the timeout return `exitCode: null` and `outcome.type: 'timeout'`
- **Network errors**: HTTP/network errors are caught and logged
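
The scenarios above can be told apart from the collected results alone. The following is a minimal sketch, assuming only the `CommandResult` shape documented earlier; `classifyResult` and `summarize` are hypothetical helper names, and how network errors surface in `exitCode` is not specified here, so they are not given a separate bucket.

```typescript
// Mirrors the CommandResult shape documented above.
interface CommandResult {
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null;
  timestamp: number;
}

type CommandStatus = 'ok' | 'failed' | 'timeout';

// Hypothetical helper: maps a result to a coarse status.
function classifyResult(result: CommandResult): CommandStatus {
  if (result.exitCode === null) return 'timeout'; // null exit code marks a timeout
  return result.exitCode === 0 ? 'ok' : 'failed';
}

// Hypothetical helper: counts results per status.
function summarize(results: CommandResult[]): Record<CommandStatus, number> {
  const counts: Record<CommandStatus, number> = { ok: 0, failed: 0, timeout: 0 };
  for (const result of results) {
    counts[classifyResult(result)] += 1;
  }
  return counts;
}
```

Calling `summarize(shell.results)` after a run would give a quick count of successful, failed, and timed-out commands.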

## Editor Adapter

The `Editor` class implements file operations using the OpenAI Agents patch-based editing system.

### Features

- Creates files with initial content using diffs
- Updates existing files by applying diffs
- Deletes files
- Automatically creates parent directories when needed
- Validates paths to prevent operations outside the workspace
- Collects results with timestamps for each operation

### File Operation Results

Each file operation is automatically collected in `editor.results`:

```typescript
interface FileOperationResult {
  operation: 'create' | 'update' | 'delete';
  path: string; // Relative path from workspace root
  status: 'completed' | 'failed';
  output: string; // Human-readable status message
  error?: string; // Error message if status is 'failed'
  timestamp: number; // Unix timestamp in milliseconds
}
```

### Path Resolution

The Editor enforces security by:

- Resolving relative paths within the workspace root (`/workspace` by default)
- Preventing path traversal attacks (e.g., `../../../etc/passwd`)
- Normalizing path separators and removing redundant segments
- Throwing errors for operations outside the workspace
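
To make the rules above concrete, here is a rough sketch of workspace-bounded path resolution. It is not the `Editor`'s actual implementation: `resolveWorkspacePath` is a hypothetical name, and treating a leading `/` as relative to the workspace root is an assumption made for this example.

```typescript
// Illustrative only: not the Editor's actual implementation.
function resolveWorkspacePath(root: string, relativePath: string): string {
  const segments: string[] = [];
  // Normalize backslashes to forward slashes, then walk each segment.
  for (const segment of relativePath.replace(/\\/g, '/').split('/')) {
    if (segment === '' || segment === '.') continue; // redundant segment
    if (segment === '..') {
      if (segments.length === 0) {
        // Stepping above the root would leave the workspace.
        throw new Error(`Path escapes workspace: ${relativePath}`);
      }
      segments.pop();
      continue;
    }
    segments.push(segment);
  }
  const base = root.replace(/\/+$/, ''); // drop trailing slashes from the root
  const joined = [base, ...segments].join('/');
  return joined === '' ? '/' : joined;
}

console.log(resolveWorkspacePath('/workspace', 'src/./a/../index.ts'));
// /workspace/src/index.ts
```

Tracking segments in a stack means a `..` can only undo a segment that was added inside the workspace, so `../../../etc/passwd` throws instead of resolving.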

### Example: Creating and Editing Files

```typescript
const editor = new Editor(sandbox, '/workspace');

// Agent can use apply_patch tool to:
// - Create new files with content
// - Update existing files with diffs
// - Delete files

// After agent execution, access results:
editor.results.forEach((result) => {
  console.log(`${result.operation}: ${result.path}`);
  console.log(`Status: ${result.status}`);
  if (result.error) {
    console.log(`Error: ${result.error}`);
  }
});
```

### Custom Workspace Root

You can specify a custom workspace root:

```typescript
// Use a different root directory
const editor = new Editor(sandbox, '/custom/workspace');
```

## Complete Example

Here's a complete example showing how to integrate the adapters in a Cloudflare Worker:

```typescript
import { getSandbox } from '@cloudflare/sandbox';
import { Shell, Editor } from '@cloudflare/sandbox/openai';
import { Agent, applyPatchTool, run, shellTool } from '@openai/agents';

async function handleRunRequest(request: Request, env: Env): Promise<Response> {
  try {
    const { input } = await request.json();

    if (!input || typeof input !== 'string') {
      return new Response(
        JSON.stringify({ error: 'Missing or invalid input field' }),
        { status: 400, headers: { 'Content-Type': 'application/json' } }
      );
    }

    // Get sandbox instance (reused for both shell and editor)
    const sandbox = getSandbox(env.Sandbox, 'workspace-session');

    // Create adapters
    const shell = new Shell(sandbox);
    const editor = new Editor(sandbox, '/workspace');

    // Create agent with tools
    const agent = new Agent({
      name: 'Sandbox Studio',
      model: 'gpt-4',
      instructions: `
        You can execute shell commands and edit files in the workspace.
        Use shell commands to inspect the repository and the apply_patch tool
        to create, update, or delete files. Keep responses concise and include
        command output when helpful.
      `,
      tools: [
        shellTool({ shell, needsApproval: false }),
        applyPatchTool({ editor, needsApproval: false })
      ]
    });

    // Run the agent
    const result = await run(agent, input);

    // Format response with sorted results
    const response = {
      naturalResponse: result.finalOutput || null,
      commandResults: shell.results.sort((a, b) => a.timestamp - b.timestamp),
      fileOperations: editor.results.sort((a, b) => a.timestamp - b.timestamp)
    };

    return new Response(JSON.stringify(response), {
      headers: { 'Content-Type': 'application/json' }
    });
  } catch (error) {
    return new Response(
      JSON.stringify({
        error: error instanceof Error ? error.message : 'Internal server error',
        naturalResponse: 'An error occurred while processing your request.',
        commandResults: [],
        fileOperations: []
      }),
      {
        status: 500,
        headers: { 'Content-Type': 'application/json' }
      }
    );
  }
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);

    if (url.pathname === '/run' && request.method === 'POST') {
      return handleRunRequest(request, env);
    }

    return new Response('Not found', { status: 404 });
  }
};
```

## Result Tracking

Both adapters automatically track all operations with timestamps. This makes it easy to:

- **Audit operations**: See exactly what commands were run and files were modified
- **Debug issues**: Identify which operation failed and when
- **Build UIs**: Display a timeline of agent actions
- **Logging**: Export operation history for analysis

### Combining Results

You can combine and sort results from both adapters:

```typescript
const allResults = [
...shell.results.map((r) => ({ type: 'command' as const, ...r })),
...editor.results.map((r) => ({ type: 'file' as const, ...r }))
].sort((a, b) => a.timestamp - b.timestamp);

// allResults is now a chronological list of all operations
```

## Best Practices

1. **Reuse sandbox instances**: Create one sandbox instance and share it between Shell and Editor
2. **Set appropriate timeouts**: Configure command timeouts based on expected operation duration
3. **Handle errors gracefully**: Check `status` fields in results and handle `failed` operations
4. **Validate paths**: The Editor already validates paths, but be aware of workspace boundaries
5. **Monitor resource usage**: Large command outputs or file operations may impact performance
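
For the last point, one possible mitigation is to cap the size of `stdout` and `stderr` before returning them in a response. The sketch below is an assumption-laden example rather than a feature of the adapters: `truncateOutput` and its default limit of 4000 characters are hypothetical.

```typescript
// Hypothetical helper: caps a command's output at maxChars characters.
function truncateOutput(text: string, maxChars = 4000): string {
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  // Keep the head of the output and note how much was dropped.
  return `${text.slice(0, maxChars)}\n... [${omitted} characters truncated]`;
}

console.log(truncateOutput('abcdef', 3));
// abc
// ... [3 characters truncated]
```

Applied when building the response, for example `shell.results.map((r) => ({ ...r, stdout: truncateOutput(r.stdout) }))`, this keeps the JSON payload bounded without changing what the adapter collected.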

## Limitations

- **Working directory**: Shell operations always execute in `/workspace` (or the configured root)
- **Path restrictions**: File operations are restricted to the workspace root
- **Sequential execution**: Commands execute sequentially, not in parallel
- **Timeout handling**: Timeouts stop further command execution in a batch
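
The last two limitations can be illustrated with a small loop: commands run one at a time, and the first timeout ends the batch. This is a sketch under stated assumptions, not the adapter's code. `Outcome`, `Exec`, `BatchEntry`, and `runBatch` are hypothetical names, and `exec` stands in for whatever actually runs a command in the sandbox.

```typescript
// Outcome shape assumed for this sketch; the adapter's real types may differ.
type Outcome = { type: 'exit'; exitCode: number } | { type: 'timeout' };

// Stand-in for a function that runs one command in the sandbox.
type Exec = (command: string) => Promise<Outcome>;

interface BatchEntry {
  command: string;
  outcome: Outcome;
}

// Runs commands sequentially and stops the batch after the first timeout.
async function runBatch(commands: string[], exec: Exec): Promise<BatchEntry[]> {
  const entries: BatchEntry[] = [];
  for (const command of commands) {
    const outcome = await exec(command); // wait before starting the next command
    entries.push({ command, outcome });
    if (outcome.type === 'timeout') break; // later commands are skipped
  }
  return entries;
}
```

With a batch of three commands where the second times out, `runBatch` returns two entries and the third command never starts.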

## See Also

- [OpenAI Agents SDK Documentation](https://github.com/openai/openai-agents-js/)
- [Session Execution Architecture](./SESSION_EXECUTION.md) - Understanding how commands execute in sandboxes
- [Example Implementation](../examples/openai-agents/src/index.ts) - Full working example

Review comment (Contributor):

Important: Missing Cloudflare deployment guidance.

Add section covering:

  • Durable Objects and session persistence
  • Rate limiting recommendations
  • Cost considerations (container + OpenAI API)
  • wrangler.jsonc configuration
  • Using secrets for API keys

Example structure:

## Deploying to Production

### Cloudflare-Specific Considerations

1. **Durable Objects:** Sessions persist across requests
2. **Rate Limiting:** Use Workers rate limiting or KV
3. **Costs:** Container CPU time + OpenAI tokens + DO requests
4. **Security:** Use Cloudflare Access + audit logs in KV

### Required Configuration

Your `wrangler.jsonc`:
```jsonc
{
  "durable_objects": {
    "bindings": [
      { "name": "Sandbox", "class_name": "Sandbox" }
    ]
  }
}
```

For production, use secrets:

```bash
wrangler secret put OPENAI_API_KEY
```

13 changes: 13 additions & 0 deletions examples/openai-agents/Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# This image is unique to this repo, and you'll never need it.
# Whenever you're integrating with sandbox SDK in your own project,
# you should use the official image instead:
# FROM docker.io/cloudflare/sandbox:0.5.0
# FROM cloudflare/sandbox-test:0.5.0

# On a mac, you might need to actively pick up the
# arm64 build of the image.
FROM --platform=linux/arm64 cloudflare/sandbox-test:0.5.0

Review comment (Contributor):

Hardcoded --platform=linux/arm64 breaks on non-ARM64 systems (Intel Macs, Linux x64). The official images support both platforms, so remove the platform flag:

FROM cloudflare/sandbox-test:0.5.0


# Required during local development to access exposed ports
EXPOSE 8080
EXPOSE 3000
42 changes: 42 additions & 0 deletions examples/openai-agents/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# OpenAI Agents with Cloudflare Sandbox

A conversational AI assistant that executes shell commands and edits files in a Cloudflare Sandbox.

Review comment (Contributor):

README needs prominent security warning before anyone deploys this. The auto-approval of all AI operations is a significant risk.

Suggest adding:

## Security Warning

**This example auto-approves all AI operations without human review.** The AI can:
- Execute ANY shell command
- Create, modify, or delete ANY file in /workspace
- Operate with no safety limits beyond the container itself

**Do not use in production without proper approval flows and rate limiting.**


## Setup

Create a `.env` file with your OpenAI API key:

```
OPENAI_API_KEY=your-api-key-here
```

Then start the development server:

```bash
npm start
```

## Usage

Enter natural language commands in the chat interface. The assistant can:

- Execute shell commands
- Create, edit, and delete files

All conversations are saved in your browser's localStorage.

## Deploy

```bash
npm run deploy
```

## Security Warning

**This example auto-approves all AI operations without human review.** The AI can:

- Execute ANY shell command
- Create, modify, or delete ANY file in /workspace
- Operate with no safety limits beyond the container itself

**Do not use in production without proper approval flows and rate limiting.**