Skip to content

Conversation

NathanFlurry
Copy link
Member

No description provided.

Copy link
Member Author

NathanFlurry commented Aug 13, 2025


How to use the Graphite Merge Queue

Add either label to this PR to merge it via the merge queue:

  • merge queue - adds this PR to the back of the merge queue
  • hotfix - for urgent hot fixes, skip the queue and merge this PR next

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

Copy link

claude bot commented Aug 13, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

I've completed my review of this PR that adds an engine driver to RivetKit. Here's my comprehensive feedback:

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure

Implementation Details

  • Proper lifecycle management: Implements actor start/stop/sleep callbacks correctly
  • Request proxying: Clean implementation of request forwarding through guard port
  • WebSocket support: Handles both HTTP and WebSocket connections properly
  • Configuration: Well-structured config with sensible defaults

🟡 Areas for Improvement

1. Incomplete Implementation - packages/core/src/drivers/engine/actor-driver.ts

// Line 231-235: handleFetch returns 501 Not Implemented
async #handleFetch(actorId: string, request: Request): Promise<Response> {
  // TODO: Implement proper request handling through actor
  return new Response(JSON.stringify({ error: "Not implemented" }), {
    status: 501,
    headers: { "Content-Type": "application/json" },
  });
}

Issue: Critical functionality is stubbed out
Recommendation: Either implement this or add clear documentation about limitations

2. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:192-194

name,
keys,
"unknown", // TODO: Support regions in engine

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or at least use a constant

3. Type Safety Concerns - packages/core/src/drivers/engine/actor-driver.ts:167

const actorConfig = config as any;

Issue: Casting to any defeats TypeScript's type safety
Recommendation: Define proper interfaces for actor configuration

4. Missing Validation - packages/core/src/drivers/engine/manager-driver.ts:276

const result = (await response.json()) as any;
const actorId = result.actor.actor_id as string;

Issue: No validation of API response structure
Recommendation: Add proper response validation with zod schemas

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:96-107

handler.actorPromise = Promise.withResolvers();
await handler.actorPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling

2. Memory Leak Potential - packages/core/src/drivers/engine/manager-driver.ts:27

#actors: Map<string, ActorData> = new Map();

Issue: Actors are never removed from memory (only on explicit destroy)
Recommendation: Implement cleanup mechanism for inactive actors

3. Hardcoded Configuration - packages/core/src/drivers/engine/manager-driver.ts:42

const guardUrl = `http://localhost:${this.#config.guardPort}${url.pathname}${url.search}`;

Issue: Hardcoded localhost may not work in all deployment scenarios
Recommendation: Make guard host configurable

🧪 Test Coverage

Positive:

  • Uses the existing driver test suite for comprehensive coverage
  • Properly configures test environment with random ports to avoid conflicts
  • Skips irrelevant tests (scheduling) appropriately

Concerns:

  • Dependency on external services: Tests require running engine service
  • Integration complexity: Real-timer tests may be flaky in CI environments
  • Missing unit tests: No isolated unit tests for individual driver components

🔒 Security Considerations

Minor Issues:

  1. Base64 encoding input: Input encoding in createActor may not handle all edge cases safely
  2. No input sanitization: WebSocket and HTTP headers are passed through without validation
  3. Localhost assumption: Hardcoded localhost could be a security issue in containerized environments

Recommendations:

  • Add input validation for all external API calls
  • Consider implementing proper authentication headers
  • Validate all response data from external services

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements
  • Dependencies on external engine service

2. Export in Main Module - ✅ Already implemented

Good: The driver is properly exported in packages/core/src/mod.ts:160

📋 Recommendations

High Priority:

  1. Complete the #handleFetch implementation or document limitations clearly
  2. Add proper response validation for all external API calls
  3. Implement timeout mechanisms for actor loading
  4. Add comprehensive error handling for network failures

Medium Priority:

  1. Replace any types with proper interfaces
  2. Make guard host configurable instead of hardcoding localhost
  3. Add cleanup mechanism for inactive actors
  4. Consider adding unit tests alongside integration tests

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider implementing retry logic for external API calls
  3. Add metrics/observability hooks

Overall Assessment

This is a solid foundation for an engine driver with good architectural decisions. The main concerns are around incomplete implementations and edge case handling. The code follows repository conventions well and integrates properly with the existing driver framework.

Recommendation: Approve with requested changes to address the incomplete #handleFetch implementation and add proper error handling for edge cases.


@NathanFlurry NathanFlurry force-pushed the 08-13-feat_engine_add_engine_driver branch from 348d4b3 to 6a4ec8b Compare August 14, 2025 18:13
Copy link

claude bot commented Aug 14, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure

Implementation Details

  • Proper lifecycle management: Implements actor start/stop/sleep callbacks correctly
  • Request proxying: Clean implementation of request forwarding through guard port
  • WebSocket support: Handles both HTTP and WebSocket connections properly
  • Configuration: Well-structured config with sensible defaults

🟡 Areas for Improvement

1. Typo in Method Name - packages/core/src/drivers/engine/actor-driver.ts:89

onActorSleep: this.#runerOnActorSleep.bind(this),

Issue: Method name has typo: #runerOnActorSleep should be #runnerOnActorSleep
Impact: Low - but inconsistent naming

2. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:222

"unknown", // TODO: Add regions

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or at least use a constant

3. Type Safety Concerns - packages/core/src/drivers/engine/manager-driver.ts:274-275

const result = (await response.json()) as any;
const actorId = result.actor.actor_id as string;

Issue: Casting to any defeats TypeScript's type safety
Recommendation: Define proper interfaces for API response validation

4. Missing Validation - Multiple instances in manager-driver.ts

Issue: No validation of API response structure from external engine service
Recommendation: Add proper response validation with zod schemas

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:118-121

handler.actorPromise = Promise.withResolvers();
await handler.actorPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling

2. Memory Leak Potential - packages/core/src/drivers/engine/manager-driver.ts:27

#actors: Map<string, ActorData> = new Map();

Issue: Actors are never removed from memory except on explicit destroy
Recommendation: Implement cleanup mechanism for inactive actors

3. Hardcoded Configuration - packages/core/src/drivers/engine/manager-driver.ts:36

const guardUrl = `http://localhost:${this.#config.guardPort}${url.pathname}${url.search}`;

Issue: Hardcoded localhost may not work in all deployment scenarios
Recommendation: Make guard host configurable

🧪 Test Coverage

Positive:

  • Uses the existing driver test suite for comprehensive coverage
  • Properly configures test environment with random ports to avoid conflicts
  • Skips irrelevant tests (scheduling) appropriately

Concerns:

  • Dependency on external services: Tests require running engine service
  • Integration complexity: Real-timer tests may be flaky in CI environments
  • Missing unit tests: No isolated unit tests for individual driver components

🔒 Security Considerations

Minor Issues:

  1. Base64 encoding input: Input encoding in createActor may not handle all edge cases safely
  2. No input sanitization: WebSocket and HTTP headers are passed through without validation
  3. Localhost assumption: Hardcoded localhost could be a security issue in containerized environments

Recommendations:

  • Add input validation for all external API calls
  • Consider implementing proper authentication headers
  • Validate all response data from external services

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements
  • Dependencies on external engine service

2. Export in Main Module - ✅ Already implemented

Good: The driver is properly exported in packages/core/src/mod.ts:11

📋 Recommendations

High Priority:

  1. Fix the typo in #runerOnActorSleep method name
  2. Add proper response validation for all external API calls
  3. Implement timeout mechanisms for actor loading
  4. Add comprehensive error handling for network failures

Medium Priority:

  1. Replace any types with proper interfaces
  2. Make guard host configurable instead of hardcoding localhost
  3. Add cleanup mechanism for inactive actors
  4. Consider adding unit tests alongside integration tests

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider implementing retry logic for external API calls
  3. Add metrics/observability hooks

Overall Assessment

This is a solid foundation for an engine driver with good architectural decisions. The main concerns are around edge case handling and type safety. The code follows repository conventions well and integrates properly with the existing driver framework.

Recommendation: Approve with requested changes to address the typo and add proper error handling for edge cases.


fetch: this.#runnerFetch.bind(this),
onActorStart: this.#runnerOnActorStart.bind(this),
onActorStop: this.#runnerOnActorStop.bind(this),
onActorSleep: this.#runerOnActorSleep.bind(this),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a typo in the method name: #runerOnActorSleep is missing an 'n' and should be #runnerOnActorSleep to maintain naming consistency with the other runner callback methods (#runnerFetch, #runnerOnActorStart, #runnerOnActorStop).

Suggested change
onActorSleep: this.#runerOnActorSleep.bind(this),
onActorSleep: this.#runnerOnActorSleep.bind(this),

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

@NathanFlurry NathanFlurry force-pushed the 08-13-feat_engine_add_engine_driver branch from 6a4ec8b to c2e7636 Compare August 15, 2025 04:48
Copy link

claude bot commented Aug 15, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service via the @rivetkit/engine-runner package. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure

Implementation Details

  • Proper lifecycle management: Implements actor start/stop callbacks correctly
  • Request proxying: Clean implementation of request forwarding through guard port
  • WebSocket support: Handles both HTTP and WebSocket connections properly
  • Configuration: Well-structured config with sensible defaults
  • Export integration: Properly exported in main module packages/core/src/mod.ts:11

🟡 Areas for Improvement

1. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:222

"unknown", // TODO: Add regions

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or at least use a constant

2. Type Safety Concerns - packages/core/src/drivers/engine/manager-driver.ts:403-404

const result = (await response.json()) as any;
const actorId = result.actor.actor_id as string;

Issue: Casting to any defeats TypeScript's type safety
Recommendation: Define proper interfaces for API response validation

3. Missing Validation - Multiple instances in manager-driver.ts

Issue: No validation of API response structure from external engine service
Recommendation: Add proper response validation with zod schemas

4. Incomplete Implementation - packages/core/src/drivers/engine/actor-driver.ts:165-167

async setAlarm(actor: AnyActorInstance, timestamp: number): Promise<void> {
  // Not implemented for engine driver
  throw new Error("Alarms not implemented for engine driver");
}

Issue: Critical functionality not implemented
Recommendation: Either implement this or document limitations clearly

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:126

if (handler.actorStartPromise) await handler.actorStartPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling

2. Memory Leak Potential - packages/core/src/drivers/engine/manager-driver.ts:40

#actors: Map<string, ActorData> = new Map();

Issue: Actors are never removed from memory except on explicit destroy
Recommendation: Implement cleanup mechanism for inactive actors

3. Hardcoded Configuration - packages/core/src/drivers/engine/manager-driver.ts:52

const guardUrl = `http://localhost:${this.#config.guardPort}${url.pathname}${url.search}`;

Issue: Hardcoded localhost may not work in all deployment scenarios
Recommendation: Make guard host configurable

🧪 Test Coverage

Positive:

  • Uses the existing driver test suite for comprehensive coverage
  • Properly configures test environment with random ports to avoid conflicts
  • Skips irrelevant tests (scheduling) appropriately
  • Uses real timers for proper integration testing

Concerns:

  • Dependency on external services: Tests require running engine service
  • Integration complexity: Real-timer tests may be flaky in CI environments
  • Missing unit tests: No isolated unit tests for individual driver components
  • Incomplete cleanup: Test cleanup is marked as TODO

🔒 Security Considerations

Minor Issues:

  1. Base64 encoding input: Input encoding in createActor may not handle all edge cases safely (line 384)
  2. No input sanitization: WebSocket and HTTP headers are passed through without validation
  3. Localhost assumption: Hardcoded localhost could be a security issue in containerized environments

Recommendations:

  • Add input validation for all external API calls
  • Consider implementing proper authentication headers
  • Validate all response data from external services

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements
  • Dependencies on external engine service

2. Key Serialization - packages/core/src/drivers/engine/utils.ts:3

// TODO: Need better key serialization

Issue: Current JSON serialization may not be robust for all key types

📋 Recommendations

High Priority:

  1. Add proper response validation for all external API calls using zod schemas
  2. Implement timeout mechanisms for actor loading to prevent hanging
  3. Add comprehensive error handling for network failures
  4. Make guard host configurable instead of hardcoding localhost

Medium Priority:

  1. Replace any types with proper interfaces
  2. Add cleanup mechanism for inactive actors
  3. Implement or properly document alarm functionality limitations
  4. Consider adding unit tests alongside integration tests

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider implementing retry logic for external API calls
  3. Add metrics/observability hooks
  4. Improve key serialization mechanism

Overall Assessment

This is a solid foundation for an engine driver with good architectural decisions. The main concerns are around edge case handling, type safety, and missing validation. The code follows repository conventions well and integrates properly with the existing driver framework.

Recommendation: Approve with requested changes to address response validation and add proper error handling for edge cases.


Copy link

pkg-pr-new bot commented Aug 15, 2025

More templates

@rivetkit/actor

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/actor@1173

@rivetkit/core

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/core@1173

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/db@1173

rivetkit

pnpm add https://pkg.pr.new/rivet-gg/rivetkit@1173

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/cloudflare-workers@1173

@rivetkit/redis

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/redis@1173

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/framework-base@1173

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/next-js@1173

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-gg/rivetkit/@rivetkit/react@1173

commit: c2e7636

Comment on lines 84 to 75
cleanup: async () => {
// TODO: Cleanup actor driver
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup function should properly release resources by stopping the actor driver and deleting the test namespace. Consider implementing:

cleanup: async () => {
  // Stop the actor driver to release connections
  await actorDriver.stop();
  
  // Delete the test namespace
  const deleteResponse = await fetch(`${endpoint}/namespaces/${namespace}`, {
    method: "DELETE"
  });
  
  if (!deleteResponse.ok) {
    logger().warn("Failed to delete test namespace", { 
      namespace, 
      status: deleteResponse.status 
    });
  }
}

This ensures test resources are properly cleaned up between test runs.

Suggested change
cleanup: async () => {
// TODO: Cleanup actor driver
},
cleanup: async () => {
// Stop the actor driver to release connections
await actorDriver.stop();
// Delete the test namespace
const deleteResponse = await fetch(`${endpoint}/namespaces/${namespace}`, {
method: "DELETE"
});
if (!deleteResponse.ok) {
logger().warn("Failed to delete test namespace", {
namespace,
status: deleteResponse.status
});
}
},

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

@NathanFlurry NathanFlurry force-pushed the 08-13-feat_engine_add_engine_driver branch from c2e7636 to 4513f33 Compare August 16, 2025 04:37
Copy link

claude bot commented Aug 16, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service via the @rivetkit/engine-runner package. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes and custom EngineApiError
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure
  • Proper export integration: Well-integrated into the main module exports

Implementation Details

  • Proper lifecycle management: Implements actor start/stop callbacks correctly with promise resolution
  • Request proxying: Clean implementation of request forwarding through guard system
  • WebSocket support: Handles both HTTP and WebSocket connections with bidirectional forwarding
  • Configuration: Well-structured config with sensible defaults and zod validation
  • API abstraction: Clean separation between API endpoints, utilities, and core driver logic

🟡 Areas for Improvement

1. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:246

"unknown", // TODO: Add regions

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or use a meaningful constant

2. Incomplete Alarm Implementation - packages/core/src/drivers/engine/actor-driver.ts:182-187

async setAlarm(actor: AnyActorInstance, timestamp: number): Promise<void> {
  // TODO: Set timeout
  // TODO: Use alarm on sleep  
  // TODO: Send alarm to runner
  throw new Error("Alarms not implemented for engine driver");
}

Issue: Critical scheduling functionality not implemented
Recommendation: Either implement this or document limitations clearly in driver docs

3. Key Serialization Concerns - packages/core/src/drivers/engine/utils.ts:3-11

// TODO: Need better key serialization
export function serializeActorKey(key: ActorKey): string {
  return JSON.stringify(key ?? {});
}
export function deserializeActorKey(key: string | undefined): ActorKey {
  return JSON.parse(key ?? "[]");
}

Issue: Basic JSON serialization may not handle all edge cases robustly
Recommendation: Consider more robust serialization for production use

4. Type Safety in API Responses - Multiple instances in manager-driver.ts

Issue: API responses aren't validated at runtime, only cast as types
Recommendation: Add zod schemas for API response validation for better type safety

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:135

if (handler.actorStartPromise) await handler.actorStartPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling with timeout

2. Memory Management - packages/core/src/drivers/engine/actor-driver.ts:67

#actors: Map<string, ActorHandler> = new Map();

Issue: Actors are only removed on explicit stop, potential for memory leaks
Recommendation: Consider cleanup mechanism for inactive/failed actors

3. Hardcoded Localhost - packages/core/src/drivers/engine/manager-driver.ts:48

const guardUrl = `${this.#config.endpoint}${url.pathname}${url.search}`;

Issue: While using config.endpoint is good, the default config has localhost hardcoded
Recommendation: Document deployment considerations or make more configurable

4. Incomplete Test Cleanup - packages/core/tests/driver-test-suite-engine.test.ts:81-83

cleanup: async () => {
  // TODO: Cleanup actor driver
},

Issue: No proper cleanup in tests, may lead to resource leaks
Recommendation: Implement proper cleanup to stop actor driver and delete test namespace

🧪 Test Coverage

Positive:

  • Comprehensive integration: Uses the existing driver test suite for thorough coverage
  • Real timer usage: Properly configured for integration testing with useRealTimers: true
  • Appropriate test skipping: Skips irrelevant tests (scheduling) that aren't applicable
  • Dynamic configuration: Uses environment variables with reasonable defaults

Areas for Improvement:

  • External dependency: Tests require running engine service, making them harder to run in isolation
  • Missing cleanup: TODO cleanup implementation could cause test interference
  • No unit tests: Only integration tests, missing isolated unit tests for individual components

🔒 Security Considerations

Minor Issues:

  1. Input validation: Base64 encoding in createActor (line 432) and API calls lack comprehensive input validation
  2. Header forwarding: WebSocket and HTTP headers are passed through without sanitization
  3. Error exposure: Some error details may leak internal information in API responses

Recommendations:

  • Add input validation for all external API calls using zod schemas
  • Consider implementing authentication/authorization headers if needed
  • Validate all response data from external services before processing

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements and deployment considerations
  • Dependencies on external engine service and setup instructions
  • Limitations (e.g., no alarm support)

2. Error Recovery

Missing comprehensive error recovery mechanisms for:

  • Network failures to engine service
  • Actor startup failures
  • Engine service disconnections

📋 Recommendations

High Priority:

  1. Implement test cleanup to prevent resource leaks and test interference
  2. Add timeout mechanisms for actor loading to prevent hanging
  3. Document limitations clearly, especially alarm functionality
  4. Add comprehensive error handling for network failures and edge cases

Medium Priority:

  1. Add runtime validation for API responses using zod schemas
  2. Implement proper cleanup mechanism for inactive actors
  3. Consider unit tests alongside integration tests for better coverage
  4. Improve key serialization for production robustness

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider retry logic for external API calls
  3. Add metrics/observability hooks
  4. Document deployment considerations for different environments

Overall Assessment

This is a well-architected implementation that follows repository conventions and integrates cleanly with the existing driver framework. The code quality is high with good separation of concerns. The main concerns are around incomplete implementations (alarms, test cleanup), missing validation, and potential edge case handling. The driver provides a solid foundation for engine-based actor management.

Recommendation: Approve with requested changes to address test cleanup, timeout handling, and documentation of limitations.


Comment on lines 81 to 75
cleanup: async () => {
// TODO: Cleanup actor driver
},
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup function is missing implementation for stopping the actor driver. This could lead to resource leaks since the engine runner instance created earlier will continue running. Consider adding await actorDriver.stop() to properly clean up resources during test teardown.

Suggested change
cleanup: async () => {
// TODO: Cleanup actor driver
},
cleanup: async () => {
await actorDriver.stop();
},

Spotted by Diamond

Is this helpful? React 👍 or 👎 to let us know.

@NathanFlurry NathanFlurry force-pushed the 08-13-feat_engine_add_engine_driver branch from 4513f33 to c6ffb2b Compare August 17, 2025 01:39
Copy link

claude bot commented Aug 17, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service via the @rivetkit/engine-runner package. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes and custom EngineApiError
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure
  • Proper export integration: Well-integrated into the main module exports

Implementation Details

  • Proper lifecycle management: Implements actor start/stop callbacks correctly with promise resolution
  • Request proxying: Clean implementation of request forwarding through guard system
  • WebSocket support: Handles both HTTP and WebSocket connections with bidirectional forwarding
  • Configuration: Well-structured config with sensible defaults and zod validation
  • API abstraction: Clean separation between API endpoints, utilities, and core driver logic

🟡 Areas for Improvement

1. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:246

"unknown", // TODO: Add regions

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or use a meaningful constant

2. Incomplete Alarm Implementation - packages/core/src/drivers/engine/actor-driver.ts:182-187

async setAlarm(actor: AnyActorInstance, timestamp: number): Promise<void> {
  // TODO: Set timeout
  // TODO: Use alarm on sleep  
  // TODO: Send alarm to runner
  throw new Error("Alarms not implemented for engine driver");
}

Issue: Critical scheduling functionality not implemented
Recommendation: Either implement this or document limitations clearly in driver docs

3. Key Serialization Concerns - packages/core/src/drivers/engine/utils.ts:3-11

// TODO: Need better key serialization
export function serializeActorKey(key: ActorKey): string {
  return JSON.stringify(key ?? {});
}
export function deserializeActorKey(key: string | undefined): ActorKey {
  return JSON.parse(key ?? "[]");
}

Issue: Basic JSON serialization may not handle all edge cases robustly
Recommendation: Consider more robust serialization for production use

4. Type Safety in API Responses - Multiple instances in manager-driver.ts

Issue: API responses aren't validated at runtime, only cast as types
Recommendation: Add zod schemas for API response validation for better type safety

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:135

if (handler.actorStartPromise) await handler.actorStartPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling with timeout

2. Memory Management - packages/core/src/drivers/engine/actor-driver.ts:67

#actors: Map<string, ActorHandler> = new Map();

Issue: Actors are only removed on explicit stop, potential for memory leaks
Recommendation: Consider cleanup mechanism for inactive/failed actors

3. Hardcoded Localhost - packages/core/src/drivers/engine/manager-driver.ts:48

const guardUrl = `${this.#config.endpoint}${url.pathname}${url.search}`;

Issue: While using config.endpoint is good, the default config has localhost hardcoded
Recommendation: Document deployment considerations or make more configurable

4. Incomplete Test Cleanup - packages/core/tests/driver-test-suite-engine.test.ts:81-83

cleanup: async () => {
  // TODO: Cleanup actor driver
},

Issue: No proper cleanup in tests, may lead to resource leaks
Recommendation: Implement proper cleanup to stop actor driver and delete test namespace

🧪 Test Coverage

Positive:

  • Comprehensive integration: Uses the existing driver test suite for thorough coverage
  • Real timer usage: Properly configured for integration testing with useRealTimers: true
  • Appropriate test skipping: Skips irrelevant tests (scheduling) that aren't applicable
  • Dynamic configuration: Uses environment variables with reasonable defaults

Areas for Improvement:

  • External dependency: Tests require running engine service, making them harder to run in isolation
  • Missing cleanup: TODO cleanup implementation could cause test interference
  • No unit tests: Only integration tests, missing isolated unit tests for individual components

🔒 Security Considerations

Minor Issues:

  1. Input validation: Base64 encoding in createActor (line 432) and API calls lack comprehensive input validation
  2. Header forwarding: WebSocket and HTTP headers are passed through without sanitization
  3. Error exposure: Some error details may leak internal information in API responses

Recommendations:

  • Add input validation for all external API calls using zod schemas
  • Consider implementing authentication/authorization headers if needed
  • Validate all response data from external services before processing

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements and deployment considerations
  • Dependencies on external engine service and setup instructions
  • Limitations (e.g., no alarm support)

2. Error Recovery

Missing comprehensive error recovery mechanisms for:

  • Network failures to engine service
  • Actor startup failures
  • Engine service disconnections

📋 Recommendations

High Priority:

  1. Implement test cleanup to prevent resource leaks and test interference
  2. Add timeout mechanisms for actor loading to prevent hanging
  3. Document limitations clearly, especially alarm functionality
  4. Add comprehensive error handling for network failures and edge cases

Medium Priority:

  1. Add runtime validation for API responses using zod schemas
  2. Implement proper cleanup mechanism for inactive actors
  3. Consider unit tests alongside integration tests for better coverage
  4. Improve key serialization for production robustness

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider retry logic for external API calls
  3. Add metrics/observability hooks
  4. Document deployment considerations for different environments

Overall Assessment

This is a well-architected implementation that follows repository conventions and integrates cleanly with the existing driver framework. The code quality is high with good separation of concerns. The main concerns are around incomplete implementations (alarms, test cleanup), missing validation, and potential edge case handling. The driver provides a solid foundation for engine-based actor management.

Recommendation: Approve with requested changes to address test cleanup, timeout handling, and documentation of limitations.


@NathanFlurry NathanFlurry force-pushed the 08-13-feat_engine_add_engine_driver branch from c6ffb2b to f04a99a Compare August 17, 2025 08:12
Copy link

claude bot commented Aug 17, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service via the @rivetkit/engine-runner package. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes and custom EngineApiError
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure
  • Proper export integration: Well-integrated into the main module exports

Implementation Details

  • Proper lifecycle management: Implements actor start/stop callbacks correctly with promise resolution
  • Request proxying: Clean implementation of request forwarding through guard system
  • WebSocket support: Handles both HTTP and WebSocket connections with bidirectional forwarding
  • Configuration: Well-structured config with sensible defaults and zod validation
  • API abstraction: Clean separation between API endpoints, utilities, and core driver logic

🟡 Areas for Improvement

1. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:247

"unknown", // TODO: Add regions

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or use a meaningful constant

2. Incomplete Alarm Implementation - packages/core/src/drivers/engine/actor-driver.ts:183-189

async setAlarm(actor: AnyActorInstance, timestamp: number): Promise<void> {
  // TODO: Set timeout
  // TODO: Use alarm on sleep  
  // TODO: Send alarm to runner
  throw new Error("Alarms not implemented for engine driver");
}

Issue: Critical scheduling functionality not implemented
Recommendation: Either implement this or document limitations clearly in driver docs

3. Key Serialization Concerns - packages/core/src/drivers/engine/utils.ts:3-11

// TODO: Need better key serialization
export function serializeActorKey(key: ActorKey): string {
  return JSON.stringify(key ?? {});
}
export function deserializeActorKey(key: string | undefined): ActorKey {
  return JSON.parse(key ?? "[]");
}

Issue: Basic JSON serialization may not handle all edge cases robustly
Recommendation: Consider more robust serialization for production use

4. Type Safety in API Responses - Multiple instances in manager-driver.ts

Issue: API responses aren't validated at runtime, only cast as types
Recommendation: Add zod schemas for API response validation for better type safety

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:136

if (handler.actorStartPromise) await handler.actorStartPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling with timeout

2. Memory Management - packages/core/src/drivers/engine/actor-driver.ts:67

#actors: Map<string, ActorHandler> = new Map();

Issue: Actors are only removed on explicit stop, potential for memory leaks
Recommendation: Consider cleanup mechanism for inactive/failed actors

3. Hardcoded Localhost - packages/core/src/drivers/engine/manager-driver.ts:456

const guardUrl = `${this.#config.endpoint}${url.pathname}${url.search}`;

Issue: While using config.endpoint is good, the default config has localhost hardcoded
Recommendation: Document deployment considerations or make more configurable

4. Incomplete Test Cleanup - packages/core/tests/driver-test-suite-engine.test.ts:73-75

cleanup: async () => {
  await actorDriver.shutdown?.(true);
},

Issue: No cleanup of test namespace, may lead to resource accumulation
Recommendation: Add namespace deletion for proper test cleanup

🧪 Test Coverage

Positive:

  • Comprehensive integration: Uses the existing driver test suite for thorough coverage
  • Real timer usage: Properly configured for integration testing with useRealTimers: true
  • Appropriate test skipping: Skips irrelevant tests (scheduling) that aren't applicable
  • Dynamic configuration: Uses environment variables with reasonable defaults

Areas for Improvement:

  • External dependency: Tests require running engine service, making them harder to run in isolation
  • Missing namespace cleanup: Test namespaces aren't deleted after test completion
  • No unit tests: Only integration tests, missing isolated unit tests for individual components

🔒 Security Considerations

Minor Issues:

  1. Input validation: Base64 encoding in createActor (line 389) and API calls lack comprehensive input validation
  2. Header forwarding: WebSocket and HTTP headers are passed through without sanitization
  3. Error exposure: Some error details may leak internal information in API responses

Recommendations:

  • Add input validation for all external API calls using zod schemas
  • Consider implementing authentication/authorization headers if needed
  • Validate all response data from external services before processing

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements and deployment considerations
  • Dependencies on external engine service and setup instructions
  • Limitations (e.g., no alarm support)

2. Error Recovery

Missing comprehensive error recovery mechanisms for:

  • Network failures to engine service
  • Actor startup failures
  • Engine service disconnections

📋 Recommendations

High Priority:

  1. Implement test cleanup to prevent resource leaks and test interference
  2. Add timeout mechanisms for actor loading to prevent hanging
  3. Document limitations clearly, especially alarm functionality
  4. Add comprehensive error handling for network failures and edge cases

Medium Priority:

  1. Add runtime validation for API responses using zod schemas
  2. Implement proper cleanup mechanism for inactive actors
  3. Consider unit tests alongside integration tests for better coverage
  4. Improve key serialization for production robustness

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider retry logic for external API calls
  3. Add metrics/observability hooks
  4. Document deployment considerations for different environments

Overall Assessment

This is a well-architected implementation that follows repository conventions and integrates cleanly with the existing driver framework. The code quality is high with good separation of concerns. The main concerns are around incomplete implementations (alarms, test cleanup), missing validation, and potential edge case handling. The driver provides a solid foundation for engine-based actor management.

Recommendation: Approve with requested changes to address test cleanup, timeout handling, and documentation of limitations.


@NathanFlurry NathanFlurry force-pushed the 08-13-feat_engine_add_engine_driver branch from f04a99a to 5a74e24 Compare August 18, 2025 07:27
Copy link

claude bot commented Aug 18, 2025

Claude finished @NathanFlurry's task —— View job


Reviewing Pull Request: Engine Driver Implementation

Review Tasks:

  • Read repository guidelines (CLAUDE.md)
  • Examine new engine driver implementation
  • Review deleted test fixture files
  • Check modified configuration files
  • Analyze test coverage
  • Assess code quality and best practices
  • Check for potential bugs and security issues
  • Provide comprehensive feedback

Summary

This PR adds a new "engine" driver implementation for RivetKit that integrates with an external engine service via the @rivetkit/engine-runner package. The implementation includes both actor and manager drivers, along with proper configuration and testing setup.

🟢 Positive Aspects

Code Quality & Architecture

  • Clean abstraction: The driver properly implements the ActorDriver and ManagerDriver interfaces
  • Good separation of concerns: Actor lifecycle, request routing, and configuration are well-separated
  • Consistent error handling: Uses proper error types from ActorError base classes and custom EngineApiError
  • Follows repository conventions: Adheres to CLAUDE.md guidelines for logging, error handling, and code structure
  • Proper export integration: Well-integrated into the main module exports

Implementation Details

  • Proper lifecycle management: Implements actor start/stop callbacks correctly with promise resolution
  • Request proxying: Clean implementation of request forwarding through guard system
  • WebSocket support: Handles both HTTP and WebSocket connections with bidirectional forwarding
  • Configuration: Well-structured config with sensible defaults and zod validation
  • API abstraction: Clean separation between API endpoints, utilities, and core driver logic
  • Key serialization: Robust key serialization/deserialization with proper escaping

🟡 Areas for Improvement

1. Missing Error Context - packages/core/src/drivers/engine/actor-driver.ts:248

"unknown", // TODO: Add regions

Issue: Hardcoded region value reduces observability
Recommendation: Add proper region configuration or use a meaningful constant

2. Incomplete Alarm Implementation - packages/core/src/drivers/engine/actor-driver.ts:184-189

async setAlarm(actor: AnyActorInstance, timestamp: number): Promise<void> {
  // TODO: Set timeout
  // TODO: Use alarm on sleep  
  // TODO: Send alarm to runner
  throw new Error("Alarms not implemented for engine driver");
}

Issue: Critical scheduling functionality not implemented
Recommendation: Either implement this or document limitations clearly in driver docs

3. Test Cleanup - packages/core/tests/driver-test-suite-engine.test.ts:73-75

cleanup: async () => {
  await actorDriver.shutdown?.(true);
},

Issue: No cleanup of test namespace, may lead to resource accumulation
Recommendation: Add namespace deletion for proper test cleanup

4. Memory Buffer Warning - packages/core/src/drivers/engine/manager-driver.ts:320-324

// TODO: This buffers the entire request in memory every time. We
// need to properly implement streaming bodies.

Issue: Performance concern for large request bodies
Recommendation: Consider implementing streaming for production use

🔴 Potential Issues

1. Race Condition Risk - packages/core/src/drivers/engine/actor-driver.ts:137

if (handler.actorStartPromise) await handler.actorStartPromise.promise;

Issue: If onActorStart never gets called, this will hang indefinitely
Recommendation: Add timeout mechanism or proper error handling with timeout

2. Memory Management - packages/core/src/drivers/engine/actor-driver.ts:68

#actors: Map<string, ActorHandler> = new Map();

Issue: Actors are only removed on explicit stop, potential for memory leaks
Recommendation: Consider cleanup mechanism for inactive/failed actors

3. Type Safety in API Responses - Multiple instances in manager-driver.ts

Issue: API responses aren't validated at runtime, only cast as types
Recommendation: Add zod schemas for API response validation for better type safety

4. Base64 Input Encoding - packages/core/src/drivers/engine/manager-driver.ts:238

input: input ? cbor.encode(actorInput).toString("base64") : undefined,

Issue: Large inputs will consume significant memory and may hit size limits
Recommendation: Consider alternative input handling for large payloads

🧪 Test Coverage

Positive:

  • Comprehensive integration: Uses the existing driver test suite for thorough coverage
  • Real timer usage: Properly configured for integration testing with useRealTimers: true
  • Appropriate test skipping: Skips irrelevant tests (scheduling) that aren't applicable
  • Dynamic configuration: Uses environment variables with reasonable defaults
  • Namespace isolation: Creates unique test namespaces to avoid conflicts

Areas for Improvement:

  • External dependency: Tests require running engine service, making them harder to run in isolation
  • Missing namespace cleanup: Test namespaces aren't deleted after test completion
  • No unit tests: Only integration tests, missing isolated unit tests for individual components

🔒 Security Considerations

Minor Issues:

  1. Input validation: Base64 encoding in API calls lacks comprehensive input validation
  2. Header forwarding: WebSocket and HTTP headers are passed through without sanitization
  3. Error exposure: Some error details may leak internal information in API responses

Recommendations:

  • Add input validation for all external API calls using zod schemas
  • Consider implementing authentication/authorization headers if needed
  • Validate all response data from external services before processing

🏗️ Missing Infrastructure

1. Driver Documentation

The new driver lacks documentation explaining:

  • When to use the engine driver vs other drivers
  • Configuration requirements and deployment considerations
  • Dependencies on external engine service and setup instructions
  • Limitations (e.g., no alarm support, request buffering)

2. Error Recovery

Missing comprehensive error recovery mechanisms for:

  • Network failures to engine service
  • Actor startup failures
  • Engine service disconnections

📋 Recommendations

High Priority:

  1. Implement test cleanup to prevent resource leaks and test interference
  2. Add timeout mechanisms for actor loading to prevent hanging
  3. Document limitations clearly, especially alarm functionality
  4. Add comprehensive error handling for network failures and edge cases

Medium Priority:

  1. Add runtime validation for API responses using zod schemas
  2. Implement proper cleanup mechanism for inactive actors
  3. Consider unit tests alongside integration tests for better coverage
  4. Add region configuration instead of hardcoded "unknown"

Low Priority:

  1. Add JSDoc comments for public APIs
  2. Consider retry logic for external API calls
  3. Add metrics/observability hooks
  4. Implement streaming bodies for large request handling

Overall Assessment

This is a well-architected implementation that follows repository conventions and integrates cleanly with the existing driver framework. The code quality is high with good separation of concerns. The main concerns are around incomplete implementations (alarms, test cleanup), missing validation, and potential edge case handling. The driver provides a solid foundation for engine-based actor management.

Recommendation: Approve with requested changes to address test cleanup, timeout handling, and documentation of limitations.


Copy link

graphite-app bot commented Sep 2, 2025

Merge activity

  • Sep 2, 9:12 AM UTC: NathanFlurry added this pull request to the Graphite merge queue.
  • Sep 2, 9:13 AM UTC: CI is running for this pull request on a draft pull request (#1190) due to your merge queue CI optimization settings.
  • Sep 2, 9:14 AM UTC: Merged by the Graphite merge queue via draft PR: #1190.

graphite-app bot pushed a commit that referenced this pull request Sep 2, 2025
@graphite-app graphite-app bot closed this Sep 2, 2025
@graphite-app graphite-app bot deleted the 08-13-feat_engine_add_engine_driver branch September 2, 2025 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant