openai agents shell/apply_patch example #221
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| --- | ||
| '@cloudflare/sandbox': patch | ||
| --- | ||
|
|
||
| Add OpenAI Agents adapters | ||
|
|
||
| Add OpenAI Agents adapters (`Shell` and `Editor`) that integrate Cloudflare Sandbox with the OpenAI Agents SDK. These adapters enable AI agents to execute shell commands and perform file operations (create, update, delete) inside sandboxed environments. Both adapters automatically collect and timestamp results from operations, making it easy to track command execution and file modifications during agent sessions. The adapters are exported from `@cloudflare/sandbox/openai` and implement the OpenAI Agents `Shell` and `Editor` interfaces. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,323 @@ | ||
| # OpenAI Agents Adapter | ||
|
|
||
| The Cloudflare Sandbox SDK provides adapters that integrate with the [OpenAI Agents SDK](https://github.com/openai/openai-agents-js/) so that AI agents can execute shell commands and perform file operations inside sandboxed environments. | ||
|
|
||
| ## Overview | ||
|
|
||
| The OpenAI Agents adapter consists of two main components: | ||
|
|
||
| - **`Shell`**: Implements the OpenAI Agents `Shell` interface, allowing agents to execute shell commands in the sandbox | ||
| - **`Editor`**: Implements the OpenAI Agents `Editor` interface, enabling agents to create, update, and delete files using patch operations | ||
|
|
||
| Both adapters automatically collect results from operations, making it easy to track what commands were executed and what files were modified during an agent session. | ||
|
|
||
| ## Installation | ||
|
|
||
| The adapters are part of the `@cloudflare/sandbox` package: | ||
|
|
||
| ```typescript | ||
| import { getSandbox } from '@cloudflare/sandbox'; | ||
| import { Shell, Editor } from '@cloudflare/sandbox/openai'; | ||
| import { Agent, applyPatchTool, run, shellTool } from '@openai/agents'; | ||
| ``` | ||
|
|
||
| ## Basic Usage | ||
|
|
||
| ### Setting Up an Agent | ||
|
|
||
| ```typescript | ||
| import { getSandbox } from '@cloudflare/sandbox'; | ||
| import { Shell, Editor } from '@cloudflare/sandbox/openai'; | ||
| import { Agent, applyPatchTool, run, shellTool } from '@openai/agents'; | ||
|
|
||
| export default { | ||
| async fetch(request: Request, env: Env): Promise<Response> { | ||
| // Get a sandbox instance | ||
| const sandbox = getSandbox(env.Sandbox, 'workspace-session'); | ||
|
|
||
| // Create shell adapter (executes commands in /workspace by default) | ||
| const shell = new Shell(sandbox); | ||
|
|
||
| // Create editor adapter (operates on /workspace by default) | ||
| const editor = new Editor(sandbox, '/workspace'); | ||
|
|
||
| // Create an agent with both tools | ||
| const agent = new Agent({ | ||
| name: 'Sandbox Assistant', | ||
| model: 'gpt-4', | ||
| instructions: | ||
| 'You can execute shell commands and edit files in the workspace.', | ||
| tools: [ | ||
| shellTool({ shell, needsApproval: false }), | ||
| applyPatchTool({ editor, needsApproval: false }) | ||
| ] | ||
| }); | ||
|
|
||
| // Run the agent with user input | ||
| const { input } = await request.json(); | ||
| const result = await run(agent, input); | ||
|
|
||
| // Access collected results | ||
| const commandResults = shell.results; | ||
| const fileOperations = editor.results; | ||
|
|
||
| return new Response( | ||
| JSON.stringify({ | ||
| naturalResponse: result.finalOutput, | ||
| commandResults, | ||
| fileOperations | ||
| }), | ||
| { | ||
| headers: { 'Content-Type': 'application/json' } | ||
| } | ||
| ); | ||
| } | ||
| }; | ||
| ``` | ||
|
|
||
| ## Shell Adapter | ||
|
|
||
| The `Shell` class adapts Cloudflare Sandbox `exec` calls to the OpenAI Agents `Shell` contract. | ||
|
|
||
| ### Features | ||
|
|
||
| - Executes commands sequentially in the sandbox | ||
| - Preserves working directory (`/workspace` by default) | ||
| - Handles timeouts and errors gracefully | ||
| - Collects results with timestamps for each command | ||
| - Separates stdout and stderr output | ||
|
|
||
| ### Command Results | ||
|
|
||
| Each executed command is automatically collected in `shell.results`: | ||
|
|
||
| ```typescript | ||
| interface CommandResult { | ||
| command: string; // The command that was executed | ||
| stdout: string; // Standard output | ||
| stderr: string; // Standard error | ||
| exitCode: number | null; // Exit code (null for timeouts) | ||
| timestamp: number; // Unix timestamp in milliseconds | ||
| } | ||
| ``` | ||
|
|
||
| ### Example: Inspecting Workspace | ||
|
|
||
| ```typescript | ||
| const shell = new Shell(sandbox); | ||
|
|
||
| // Agent can execute commands like: | ||
| // - ls -la | ||
| // - cat package.json | ||
| // - git status | ||
| // - npm install | ||
|
|
||
| // After agent execution, access results: | ||
| shell.results.forEach((result) => { | ||
| console.log(`Command: ${result.command}`); | ||
| console.log(`Exit code: ${result.exitCode}`); | ||
| console.log(`Output: ${result.stdout}`); | ||
| }); | ||
| ``` | ||
|
|
||
| ### Error Handling | ||
|
|
||
| The Shell adapter handles various error scenarios: | ||
|
|
||
| - **Command failures**: Non-zero exit codes are captured in `exitCode` | ||
| - **Timeouts**: Commands that exceed the timeout are recorded with `exitCode: null`, and the outcome reported back to the agent has `outcome.type: 'timeout'` | ||
| - **Network errors**: HTTP/network errors are caught and logged | ||
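A minimal sketch of how a caller might act on these outcomes after a run. The `CommandResult` shape is copied from the interface documented above; `classifyCommand` and `findProblems` are hypothetical helper names, not part of the adapter, and the sketch assumes that `exitCode: null` always means a timeout, as the interface comment states.

```typescript
// Shape copied from the CommandResult interface documented above.
interface CommandResult {
  command: string;
  stdout: string;
  stderr: string;
  exitCode: number | null;
  timestamp: number;
}

type CommandStatus = 'ok' | 'failed' | 'timeout';

// Hypothetical helper: maps a collected result to a coarse status.
// Assumes exitCode === null is only ever produced by a timeout.
function classifyCommand(result: CommandResult): CommandStatus {
  if (result.exitCode === null) return 'timeout';
  return result.exitCode === 0 ? 'ok' : 'failed';
}

// Hypothetical helper: returns every result that did not exit cleanly,
// preserving the original order and leaving the input untouched.
function findProblems(results: readonly CommandResult[]): CommandResult[] {
  return results.filter((r) => classifyCommand(r) !== 'ok');
}
```

Something like `findProblems(shell.results)` could then drive logging or a retry decision; network errors are not modeled here because the text above does not say how they surface in `shell.results`.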
|
|
||
| ## Editor Adapter | ||
|
|
||
| The `Editor` class implements file operations using the OpenAI Agents patch-based editing system. | ||
|
|
||
| ### Features | ||
|
|
||
| - Creates files with initial content using diffs | ||
| - Updates existing files by applying diffs | ||
| - Deletes files | ||
| - Automatically creates parent directories when needed | ||
| - Validates paths to prevent operations outside the workspace | ||
| - Collects results with timestamps for each operation | ||
|
|
||
| ### File Operation Results | ||
|
|
||
| Each file operation is automatically collected in `editor.results`: | ||
|
|
||
| ```typescript | ||
| interface FileOperationResult { | ||
| operation: 'create' | 'update' | 'delete'; | ||
| path: string; // Relative path from workspace root | ||
| status: 'completed' | 'failed'; | ||
| output: string; // Human-readable status message | ||
| error?: string; // Error message if status is 'failed' | ||
| timestamp: number; // Unix timestamp in milliseconds | ||
| } | ||
| ``` | ||
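As an illustration of consuming these records, the sketch below counts completed operations by kind and gathers failures in one place. `FileOperationResult` mirrors the interface above; `FileOperationSummary` and `summarizeFileOperations` are hypothetical names introduced only for this example.

```typescript
// Shape copied from the FileOperationResult interface documented above.
interface FileOperationResult {
  operation: 'create' | 'update' | 'delete';
  path: string;
  status: 'completed' | 'failed';
  output: string;
  error?: string;
  timestamp: number;
}

// Hypothetical summary shape, not part of the adapter.
interface FileOperationSummary {
  completed: Record<FileOperationResult['operation'], number>;
  failures: { path: string; operation: string; error: string }[];
}

// Hypothetical helper: tallies completed operations and lists failures.
// Falls back to the human-readable `output` when `error` is absent.
function summarizeFileOperations(
  results: readonly FileOperationResult[]
): FileOperationSummary {
  const summary: FileOperationSummary = {
    completed: { create: 0, update: 0, delete: 0 },
    failures: []
  };
  for (const r of results) {
    if (r.status === 'completed') {
      summary.completed[r.operation] += 1;
    } else {
      summary.failures.push({
        path: r.path,
        operation: r.operation,
        error: r.error ?? r.output
      });
    }
  }
  return summary;
}
```

A caller could pass `editor.results` to this helper once the agent run has finished and surface `failures` to the user.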
|
|
||
| ### Path Resolution | ||
|
|
||
| The Editor enforces security by: | ||
|
|
||
| - Resolving relative paths within the workspace root (`/workspace` by default) | ||
| - Preventing path traversal attacks (e.g., `../../../etc/passwd`) | ||
| - Normalizing path separators and removing redundant segments | ||
| - Throwing errors for operations outside the workspace | ||
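The rules above can be sketched as a small, dependency-free function. This is an illustration of the general technique, not the adapter's actual implementation: `resolveWorkspacePath` is a hypothetical name, and the choices to treat backslashes as separators and to interpret every input as relative to the root are assumptions of this sketch.

```typescript
// Hypothetical helper: resolves `requested` against `root` and throws if the
// result would leave the root. Every input is treated as relative to `root`
// (a leading slash is ignored), which is an assumption of this sketch.
function resolveWorkspacePath(root: string, requested: string): string {
  const rootSegments = root.split('/').filter((s) => s !== '' && s !== '.');
  const resolved = [...rootSegments];

  // Split on both separator styles so "a\\b" and "a/b" normalize the same way.
  for (const segment of requested.split(/[\\/]+/)) {
    if (segment === '' || segment === '.') continue; // redundant segment
    if (segment === '..') {
      // Popping past the root would escape the workspace.
      if (resolved.length <= rootSegments.length) {
        throw new Error(`Path escapes workspace root: ${requested}`);
      }
      resolved.pop();
      continue;
    }
    resolved.push(segment);
  }
  return '/' + resolved.join('/');
}
```

With this sketch, `resolveWorkspacePath('/workspace', 'src/./a/../index.ts')` yields `/workspace/src/index.ts`, while `../../../etc/passwd` throws on its first `..` segment.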
|
|
||
| ### Example: Creating and Editing Files | ||
|
|
||
| ```typescript | ||
| const editor = new Editor(sandbox, '/workspace'); | ||
|
|
||
| // Agent can use apply_patch tool to: | ||
| // - Create new files with content | ||
| // - Update existing files with diffs | ||
| // - Delete files | ||
|
|
||
| // After agent execution, access results: | ||
| editor.results.forEach((result) => { | ||
| console.log(`${result.operation}: ${result.path}`); | ||
| console.log(`Status: ${result.status}`); | ||
| if (result.error) { | ||
| console.log(`Error: ${result.error}`); | ||
| } | ||
| }); | ||
| ``` | ||
|
|
||
| ### Custom Workspace Root | ||
|
|
||
| You can specify a custom workspace root: | ||
|
|
||
| ```typescript | ||
| // Use a different root directory | ||
| const editor = new Editor(sandbox, '/custom/workspace'); | ||
| ``` | ||
|
|
||
| ## Complete Example | ||
|
|
||
| Here's a complete example showing how to integrate the adapters in a Cloudflare Worker: | ||
|
|
||
| ```typescript | ||
| import { getSandbox } from '@cloudflare/sandbox'; | ||
| import { Shell, Editor } from '@cloudflare/sandbox/openai'; | ||
| import { Agent, applyPatchTool, run, shellTool } from '@openai/agents'; | ||
|
|
||
| async function handleRunRequest(request: Request, env: Env): Promise<Response> { | ||
| try { | ||
| const { input } = await request.json(); | ||
|
|
||
| if (!input || typeof input !== 'string') { | ||
| return new Response( | ||
| JSON.stringify({ error: 'Missing or invalid input field' }), | ||
| { status: 400, headers: { 'Content-Type': 'application/json' } } | ||
| ); | ||
| } | ||
|
|
||
| // Get sandbox instance (reused for both shell and editor) | ||
| const sandbox = getSandbox(env.Sandbox, 'workspace-session'); | ||
|
|
||
| // Create adapters | ||
| const shell = new Shell(sandbox); | ||
| const editor = new Editor(sandbox, '/workspace'); | ||
|
|
||
| // Create agent with tools | ||
| const agent = new Agent({ | ||
| name: 'Sandbox Studio', | ||
| model: 'gpt-4', | ||
| instructions: ` | ||
| You can execute shell commands and edit files in the workspace. | ||
| Use shell commands to inspect the repository and the apply_patch tool | ||
| to create, update, or delete files. Keep responses concise and include | ||
| command output when helpful. | ||
| `, | ||
| tools: [ | ||
| shellTool({ shell, needsApproval: false }), | ||
| applyPatchTool({ editor, needsApproval: false }) | ||
| ] | ||
| }); | ||
|
|
||
| // Run the agent | ||
| const result = await run(agent, input); | ||
|
|
||
| // Format response with sorted copies (Array.prototype.sort mutates in place) | ||
| const response = { | ||
| naturalResponse: result.finalOutput ?? null, | ||
| commandResults: [...shell.results].sort((a, b) => a.timestamp - b.timestamp), | ||
| fileOperations: [...editor.results].sort((a, b) => a.timestamp - b.timestamp) | ||
| }; | ||
|
|
||
| return new Response(JSON.stringify(response), { | ||
| headers: { 'Content-Type': 'application/json' } | ||
| }); | ||
| } catch (error) { | ||
| return new Response( | ||
| JSON.stringify({ | ||
| error: error instanceof Error ? error.message : 'Internal server error', | ||
| naturalResponse: 'An error occurred while processing your request.', | ||
| commandResults: [], | ||
| fileOperations: [] | ||
| }), | ||
| { | ||
| status: 500, | ||
| headers: { 'Content-Type': 'application/json' } | ||
| } | ||
| ); | ||
| } | ||
| } | ||
|
|
||
| export default { | ||
| async fetch(request: Request, env: Env): Promise<Response> { | ||
| const url = new URL(request.url); | ||
|
|
||
| if (url.pathname === '/run' && request.method === 'POST') { | ||
| return handleRunRequest(request, env); | ||
| } | ||
|
|
||
| return new Response('Not found', { status: 404 }); | ||
| } | ||
| }; | ||
| ``` | ||
|
|
||
| ## Result Tracking | ||
|
|
||
| Both adapters automatically track all operations with timestamps. This makes it easy to: | ||
|
|
||
| - **Audit operations**: See exactly what commands were run and files were modified | ||
| - **Debug issues**: Identify which operation failed and when | ||
| - **Build UIs**: Display a timeline of agent actions | ||
| - **Logging**: Export operation history for analysis | ||
|
|
||
| ### Combining Results | ||
|
|
||
| You can combine and sort results from both adapters: | ||
|
|
||
| ```typescript | ||
| const allResults = [ | ||
| ...shell.results.map((r) => ({ type: 'command' as const, ...r })), | ||
| ...editor.results.map((r) => ({ type: 'file' as const, ...r })) | ||
| ].sort((a, b) => a.timestamp - b.timestamp); | ||
|
|
||
| // allResults is now a chronological list of all operations | ||
| ``` | ||
|
|
||
| ## Best Practices | ||
|
|
||
| 1. **Reuse sandbox instances**: Create one sandbox instance and share it between Shell and Editor | ||
| 2. **Set appropriate timeouts**: Configure command timeouts based on expected operation duration | ||
| 3. **Handle errors gracefully**: Check `status` on `editor.results` for `failed` operations, and check `exitCode` on `shell.results` for non-zero or `null` values | ||
| 4. **Validate paths**: The Editor already validates paths, but be aware of workspace boundaries | ||
| 5. **Monitor resource usage**: Large command outputs or file operations may impact performance | ||
|
|
||
| ## Limitations | ||
|
|
||
| - **Working directory**: Shell operations always execute in `/workspace` (or the configured root) | ||
| - **Path restrictions**: File operations are restricted to the workspace root | ||
| - **Sequential execution**: Commands execute sequentially, not in parallel | ||
| - **Timeout handling**: Timeouts stop further command execution in a batch | ||
|
|
||
| ## See Also | ||
|
|
||
| - [OpenAI Agents SDK Documentation](https://github.com/openai/openai-agents-js/) | ||
| - [Session Execution Architecture](./SESSION_EXECUTION.md) - Understanding how commands execute in sandboxes | ||
| - [Example Implementation](../examples/openai-agents/src/index.ts) - Full working example | ||
|
Contributor
Important: Missing Cloudflare deployment guidance. Add section covering:
Example structure:
## Deploying to Production
### Cloudflare-Specific Considerations
1. **Durable Objects:** Sessions persist across requests
2. **Rate Limiting:** Use Workers rate limiting or KV
3. **Costs:** Container CPU time + OpenAI tokens + DO requests
4. **Security:** Use Cloudflare Access + audit logs in KV
### Required Configuration
Your `wrangler.jsonc`:
```jsonc
{
  "durable_objects": {
    "bindings": [
      { "name": "Sandbox", "class_name": "Sandbox" }
    ]
  }
}
```
For production, use secrets: `wrangler secret put OPENAI_API_KEY`
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| # This image is unique to this repo, and you'll never need it. | ||
| # Whenever you're integrating with sandbox SDK in your own project, | ||
| # you should use the official image instead: | ||
| # FROM docker.io/cloudflare/sandbox:0.5.0 | ||
| # FROM cloudflare/sandbox-test:0.5.0 | ||
|
|
||
| # On a mac, you might need to actively pick up the | ||
| # arm64 build of the image. | ||
| FROM --platform=linux/arm64 cloudflare/sandbox-test:0.5.0 | ||
|
Contributor
Hardcoded `FROM cloudflare/sandbox-test:0.5.0`
||
|
|
||
| # Required during local development to access exposed ports | ||
| EXPOSE 8080 | ||
| EXPOSE 3000 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| # OpenAI Agents with Cloudflare Sandbox | ||
threepointone marked this conversation as resolved.
|
||
|
|
||
| A conversational AI assistant that executes shell commands and edits files in a Cloudflare Sandbox. | ||
|
Contributor
README needs prominent security warning before anyone deploys this. The auto-approval of all AI operations is a significant risk. Suggest adding:
## Security Warning
**This example auto-approves all AI operations without human review.** The AI can:
- Execute ANY shell command
- Create, modify, or delete ANY file in /workspace
- No safety limits beyond the container itself
**Do not use in production without proper approval flows and rate limiting.**
||
|
|
||
| ## Setup | ||
|
|
||
| Create a `.env` file with your OpenAI API key: | ||
|
|
||
| ``` | ||
| OPENAI_API_KEY=your-api-key-here | ||
| ``` | ||
|
|
||
| Then start the development server: | ||
|
|
||
| ```bash | ||
| npm start | ||
| ``` | ||
|
|
||
| ## Usage | ||
|
|
||
| Enter natural language commands in the chat interface. The assistant can: | ||
|
|
||
| - Execute shell commands | ||
| - Create, edit, and delete files | ||
|
|
||
| All conversations are saved in your browser's localStorage. | ||
|
|
||
| ## Deploy | ||
|
|
||
| ```bash | ||
| npm run deploy | ||
| ``` | ||
|
|
||
| ## Security Warning | ||
|
|
||
| **This example auto-approves all AI operations without human review.** The AI can: | ||
|
|
||
| - Execute ANY shell command | ||
| - Create, modify, or delete ANY file in /workspace | ||
| - Run with no safety limits beyond the container itself | ||
|
|
||
| **Do not use in production without proper approval flows and rate limiting.** | ||
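One possible starting point for an approval flow is a conservative policy that auto-approves only a short allowlist of read-only commands and sends everything else to a human. The sketch below is illustrative only: `READ_ONLY_COMMANDS`, `SHELL_METACHARACTERS`, and `commandNeedsApproval` are hypothetical names, the allowlist is an example rather than a vetted set, and how such a predicate is wired into the SDK's `needsApproval` option is not covered by this PR and should be checked against the OpenAI Agents SDK documentation.

```typescript
// Hypothetical allowlist of binaries treated as read-only in this sketch.
const READ_ONLY_COMMANDS = new Set(['ls', 'cat', 'pwd', 'head', 'tail', 'wc']);

// Characters that can chain, substitute, or redirect commands. Any command
// containing one of them is sent for review instead of being parsed.
const SHELL_METACHARACTERS = /[;&|<>`$()\n]/;

// Hypothetical policy: returns true when a human should review the command.
// Defaults to requiring approval for anything not clearly read-only.
function commandNeedsApproval(command: string): boolean {
  const trimmed = command.trim();
  if (trimmed === '' || SHELL_METACHARACTERS.test(trimmed)) return true;
  const [binary = ''] = trimmed.split(/\s+/);
  return !READ_ONLY_COMMANDS.has(binary);
}
```

The policy fails closed: an unknown binary such as `git` or `npm` requires approval even for harmless subcommands, which trades convenience for a smaller attack surface.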
Critical: Security warning needs to be prominent at the top.
Add this immediately after the title:
Developers scanning quickly will miss the security implications if it's only in the example README.