LCORE: 490 - Add e2e tests for health endpoints #445
Conversation
Walkthrough

Restores a disrupted llama-stack container after scenarios: the disruption step stops the container and records prior state; after_scenario restarts the container when needed and polls its /v1/health endpoint until healthy; health feature expectations are adjusted for readiness (503) and liveness (200) under disruption.
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant Tester as Test Runner
participant Step as Step: llama_stack_connection_broken
participant Docker as Docker Engine
Note over Tester,Step: Disrupt llama-stack before readiness/liveness checks
Tester->>Step: Execute disruption step
Step->>Docker: docker inspect --format='{{.State.Running}}' llama-stack
alt Container running
Step->>Docker: docker stop llama-stack
Note right of Step: wait ~2s to ensure disruption
Step-->>Tester: context.llama_stack_was_running = True
else Not running
Step-->>Tester: context.llama_stack_was_running = False
end
```

```mermaid
sequenceDiagram
autonumber
participant Hooks as after_scenario Hook
participant Docker as Docker Engine
Note over Hooks,Docker: Restore llama-stack only if it was running before scenario
Hooks->>Hooks: if context.llama_stack_was_running
alt Was running
Hooks->>Docker: docker start llama-stack
Note right of Hooks: wait ~5s
loop Up to 6 attempts (5s interval)
Hooks->>Docker: docker exec llama-stack curl -f http://localhost:8321/v1/health
alt Health OK
Docker-->>Hooks: 200 OK -> stop polling
else Timeout/Failure
Docker-->>Hooks: failure/timeout -> log and sleep 5s
end
end
Note over Hooks: log warning if still unhealthy after attempts
else Was not running
Note right of Hooks: no action
end
```
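For orientation, here is a minimal, untested sketch of the flow both diagrams describe, assuming a Behave step module plus an `after_scenario` hook in `environment.py`; the container name (`llama-stack`) and health URL come from this PR, while the step text and function names are illustrative:

```python
import subprocess
import time

from behave import step
from behave.runner import Context


@step("the connection to llama-stack is broken")  # illustrative step text
def llama_stack_connection_broken(context: Context) -> None:
    """Stop the llama-stack container and record whether it was running."""
    result = subprocess.run(
        ["docker", "inspect", "-f", "{{.State.Running}}", "llama-stack"],
        capture_output=True, text=True, check=False,
    )
    context.llama_stack_was_running = (
        result.returncode == 0 and result.stdout.strip() == "true"
    )
    if context.llama_stack_was_running:
        subprocess.run(
            ["docker", "stop", "llama-stack"], check=True, capture_output=True
        )


def after_scenario(context: Context, scenario) -> None:
    """Restart llama-stack (only if we stopped it) and poll until healthy."""
    if getattr(context, "llama_stack_was_running", False):
        subprocess.run(
            ["docker", "start", "llama-stack"], check=True, capture_output=True
        )
        for _attempt in range(6):  # up to ~30s of polling
            try:
                probe = subprocess.run(
                    ["docker", "exec", "llama-stack",
                     "curl", "-sf", "http://localhost:8321/v1/health"],
                    capture_output=True, timeout=5, check=False,
                )
                if probe.returncode == 0:
                    break  # healthy again
            except subprocess.TimeoutExpired:
                pass  # a timeout just means "not healthy yet"
            time.sleep(5)
        else:
            print("Warning: llama-stack still unhealthy after restart")
    context.llama_stack_was_running = False  # reset per-scenario state
```

The loop's `else` clause fires only when no `break` occurred, which maps to the diagram's "log warning if still unhealthy after attempts" note.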
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 4
🧹 Nitpick comments (6)
tests/e2e/features/steps/health.py (3)
29-31: Replace the fixed sleep with a short poll to confirm the container actually stopped. A static 2s sleep is brittle and can still race under load; polling until the container is no longer running is more reliable.
Apply this diff:
```diff
-        # Wait a moment for the connection to be fully disrupted
-        time.sleep(2)
+        # Wait until the container is confirmed not running (max ~5s)
+        for _ in range(10):
+            chk = subprocess.run(
+                [
+                    "docker", "ps",
+                    "--filter", "name=^/llama-stack$",
+                    "--filter", "status=running",
+                    "--format", "{{.ID}}",
+                ],
+                capture_output=True,
+                text=True,
+                check=False,
+            )
+            if not chk.stdout.strip():
+                break
+            time.sleep(0.5)
```
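As a side note on the suggested filter: Docker stores container names with a leading slash internally, and the `name` filter is a regex match, so `^/llama-stack$` matches that container exactly rather than any name that merely contains `llama-stack`.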
36-37: Broaden exception handling to cover a missing docker binary and avoid noisy stack traces. If docker is unavailable, `FileNotFoundError` is raised. Catch it explicitly, and avoid leaving `llama_stack_was_running` set to `True` on failure. Apply this diff:
```diff
-    except subprocess.CalledProcessError as e:
-        print(f"Warning: Could not disrupt Llama Stack connection: {e}")
+    except (subprocess.CalledProcessError, FileNotFoundError) as e:
+        context.llama_stack_was_running = False
+        print(f"Warning: Could not disrupt Llama Stack connection: {e}")
```
41-45: Unimplemented step: either implement it or delete it to prevent accidental use. This placeholder can cause confusion later. If not needed, remove it; otherwise implement a minimal stop using docker/compose.
Example minimal implementation (only if truly needed):
```diff
 def stop_service(context: Context) -> None:
-    """Stop service."""
-    # TODO: add step implementation
-    assert context is not None
+    """Stop REST API service container."""
+    try:
+        subprocess.run(
+            ["docker", "stop", "lightspeed-rest"], check=True, capture_output=True
+        )
+    except (subprocess.CalledProcessError, FileNotFoundError) as e:
+        raise AssertionError(f"Failed to stop service: {e}")
```

If you use docker compose, adjust to `docker compose stop <service>`. I can wire this to your actual service name if you confirm it.

tests/e2e/features/environment.py (1)
68-76: Ensure the wait messaging and final warning always execute on failed attempts. Given the fix above, keep the backoff semantics intact. No code change is required beyond the previous diff; this comment is retained to acknowledge the now-correct control flow.
tests/e2e/features/health.feature (2)
11-12: Polish scenario titles for grammar and clarity. These are minor grammar fixes to make the titles read cleanly; no functional impact.
Apply this diff:
```diff
-  Scenario: Check if service report proper readiness state
+  Scenario: Check if service reports the proper readiness state
@@
-  Scenario: Check if service report proper liveness state
+  Scenario: Check if service reports the proper liveness state
@@
-  Scenario: Check if service report proper readiness state when llama stack is not available
+  Scenario: Check if service reports the proper readiness state when the llama-stack is not available
@@
-  Scenario: Check if service report proper liveness state even when llama stack is not available
+  Scenario: Check if service reports the proper liveness state even when the llama-stack is not available
```

Also applies to: 29-30, 45-46, 56-57
45-46: Tag disruptive scenarios so they can be included or excluded explicitly in CI runs. Adding a tag lets you run these disruption tests conditionally (e.g., nightly) and makes their impact obvious in reports.
Apply this diff:
```diff
-  Scenario: Check if service reports the proper readiness state when the llama-stack is not available
+  @llama_disruption
+  Scenario: Check if service reports the proper readiness state when the llama-stack is not available
@@
-  Scenario: Check if service reports the proper liveness state even when the llama-stack is not available
+  @llama_disruption
+  Scenario: Check if service reports the proper liveness state even when the llama-stack is not available
```

Also applies to: 56-57
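As a usage note (assuming Behave's tag-expression support in your installed version), the tagged scenarios could then be excluded from the default run with `behave --tags="not @llama_disruption"` and exercised explicitly in a nightly job with `behave --tags=@llama_disruption`.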
📒 Files selected for processing (3)
- tests/e2e/features/environment.py (2 hunks)
- tests/e2e/features/health.feature (1 hunks)
- tests/e2e/features/steps/health.py (1 hunks)
Actionable comments posted: 0
♻️ Duplicate comments (1)
tests/e2e/features/steps/health.py (1)
15-27: Fix running-state check; current logic treats "false" as truthy and stops a non-running container
`docker inspect -f '{{.State.Running}}'` returns "true" or "false". Checking only for non-empty output will set `llama_stack_was_running = True` and attempt `docker stop` even when the container is already stopped. Also prefer `check=False` for the inspect call and derive `is_running` from both the return code and stdout. Add a small stop timeout to speed up CI. Apply:
```diff
-        result = subprocess.run(
-            ["docker", "inspect", "-f", "{{.State.Running}}", "llama-stack"],
-            capture_output=True,
-            text=True,
-            check=True,
-        )
-
-        if result.stdout.strip():
+        result = subprocess.run(
+            ["docker", "inspect", "-f", "{{.State.Running}}", "llama-stack"],
+            capture_output=True,
+            text=True,
+            check=False,
+        )
+
+        is_running = (result.returncode == 0 and result.stdout.strip().lower() == "true")
+        if is_running:
             context.llama_stack_was_running = True
             subprocess.run(
-                ["docker", "stop", "llama-stack"], check=True, capture_output=True
+                ["docker", "stop", "--time=5", "llama-stack"], check=True, capture_output=True
             )
```
🧹 Nitpick comments (4)
tests/e2e/features/steps/health.py (4)
29-31: Remove the unnecessary sleep after `docker stop`. `docker stop` is synchronous; it returns after the container has stopped, so the extra 2s sleep just slows the suite.

```diff
-        # Wait a moment for the connection to be fully disrupted
-        time.sleep(2)
+        # `docker stop` is synchronous; no extra wait needed
```
36-37: Handle missing Docker and surface stderr for easier triage. Catching only `CalledProcessError` misses environments without Docker (which raise `FileNotFoundError`). Also print the captured stderr to help debug failures.

```diff
-    except subprocess.CalledProcessError as e:
-        print(f"Warning: Could not disrupt Llama Stack connection: {e}")
+    except FileNotFoundError:
+        print("Warning: Docker not found in PATH; cannot disrupt Llama Stack connection.")
+    except subprocess.CalledProcessError as e:
+        err = getattr(e, "stderr", "") or ""
+        print(f"Warning: Could not disrupt Llama Stack connection: {e}. Stderr: {err}")
```
3-4: Parameterize the container name via the environment to ease local/CI variability. Compose setups often alter container names. Let the name be overridden with `LLAMA_STACK_CONTAINER`, defaulting to `llama-stack`.

```diff
+import os
 import subprocess
 import time

+CONTAINER_NAME = os.environ.get("LLAMA_STACK_CONTAINER", "llama-stack")
@@
-    result = subprocess.run(
-        ["docker", "inspect", "-f", "{{.State.Running}}", "llama-stack"],
+    result = subprocess.run(
+        ["docker", "inspect", "-f", "{{.State.Running}}", CONTAINER_NAME],
@@
-        subprocess.run(
-            ["docker", "stop", "--time=5", "llama-stack"], check=True, capture_output=True
+        subprocess.run(
+            ["docker", "stop", "--time=5", CONTAINER_NAME], check=True, capture_output=True
         )
```

Also applies to: 17-18, 25-27
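As a hypothetical usage example (the container name here is made up), a Compose-generated name could then be supplied at invocation time, `LLAMA_STACK_CONTAINER=myproject-llama-stack-1 behave tests/e2e/features`, leaving the default intact for CI.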
40-45: No active references for "the service is stopped". The only occurrences of this step are commented out in tests/e2e/features/conversations.feature (lines 29 and 51), so no active scenarios depend on it. You can choose to:
- Remove or update the commented lines if you don’t plan to use this step.
- Implement the `stop_service` step now (e.g., invoke the shared stop routine or the llama-stack disruptor) so it's ready once these scenarios are re-enabled.

Locations found:
- tests/e2e/features/conversations.feature:29
- tests/e2e/features/conversations.feature:51
Let me know if you’d like help wiring up the actual stop logic.
📒 Files selected for processing (1)
- tests/e2e/features/steps/health.py (1 hunks)
Force-pushed from c05ebdc to f92755d (Compare)
Actionable comments posted: 0
♻️ Duplicate comments (2)
tests/e2e/features/environment.py (2)
37-44: Reset the disruption flag in a finally block to avoid cross-scenario leakage; also handle a missing Docker binary. Behave's Context persists across scenarios; without clearing `context.llama_stack_was_running`, unrelated scenarios may attempt a restart. Add a `finally` to always clear the flag, and broaden exception handling to include `FileNotFoundError` (e.g., when `docker` isn't available). Apply this diff:
```diff
-    if hasattr(context, "llama_stack_was_running") and context.llama_stack_was_running:
-        try:
+    if hasattr(context, "llama_stack_was_running") and context.llama_stack_was_running:
+        try:
             # Start the llama-stack container again
             subprocess.run(
                 ["docker", "start", "llama-stack"], check=True, capture_output=True
             )
             ...
-        except subprocess.CalledProcessError as e:
-            print(f"Warning: Could not restore Llama Stack connection: {e}")
+        except (subprocess.CalledProcessError, FileNotFoundError) as e:
+            print(f"Warning: Could not restore Llama Stack connection: {e}")
+        finally:
+            # Prevent leakage of the flag into subsequent scenarios
+            if hasattr(context, "llama_stack_was_running"):
+                delattr(context, "llama_stack_was_running")
```

Also applies to: 81-82
50-66: Fix the health polling: `check=True` aborts on the first non-200; use `check=False` and keep retrying. With `check=True`, curl returns non-zero for 5xx and raises `CalledProcessError`, exiting the loop before any retries, which makes the subsequent `returncode` check moot. Switch to `check=False`, add curl's `-s`/`-S` and a per-request time cap, and keep catching `TimeoutExpired`. Apply this diff:
```diff
-    for attempt in range(6):  # Try for 30 seconds
+    for attempt in range(6):  # Try for ~30s of polling (per-attempt timeout)
         try:
             result = subprocess.run(
                 [
-                    "docker",
-                    "exec",
-                    "llama-stack",
-                    "curl",
-                    "-f",
-                    "http://localhost:8321/v1/health",
+                    "docker", "exec", "llama-stack",
+                    "curl", "-sfS", "--max-time", "4",
+                    "http://localhost:8321/v1/health",
                 ],
-                capture_output=True,
-                timeout=5,
-                check=True,
+                capture_output=True,
+                text=True,
+                timeout=5,
+                check=False,
             )
             if result.returncode == 0:
                 print("✓ Llama Stack connection restored successfully")
                 break
         except subprocess.TimeoutExpired:
             print(f"⏱ Health check timed out on attempt {attempt + 1}/6")
         if attempt < 5:
             print(
                 f"Waiting for Llama Stack to be healthy... (attempt {attempt + 1}/6)"
             )
             time.sleep(5)
```

Also applies to: 69-75
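For context on the suggested curl flags: `-s` silences the progress meter, `-f` makes curl exit non-zero on HTTP errors (so a 5xx maps to a non-zero return code), `-S` still prints errors despite `-s`, and `--max-time 4` caps each request so a hung connection cannot consume the whole 5s subprocess timeout.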
🧹 Nitpick comments (1)
tests/e2e/features/environment.py (1)
45-48: Optional: Remove the unconditional 5s sleep and start polling immediately. The initial fixed wait slows teardown; polling already has timeouts and backoff, so attempts can start right away.

```diff
-        print("Restoring Llama Stack connection...")
-        time.sleep(5)
+        print("Restoring Llama Stack connection...")
```
📒 Files selected for processing (3)
- tests/e2e/features/environment.py (2 hunks)
- tests/e2e/features/health.feature (1 hunks)
- tests/e2e/features/steps/health.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/e2e/features/health.feature
- tests/e2e/features/steps/health.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-25T09:11:38.678Z
Learnt from: radofuchs
PR: lightspeed-core/lightspeed-stack#445
File: tests/e2e/features/environment.py:0-0
Timestamp: 2025-08-25T09:11:38.678Z
Learning: In the Behave Python testing framework, the Context object is created once for the entire test run and reused across all features and scenarios. Custom attributes added to the context persist until explicitly cleared, so per-scenario state should be reset in hooks to maintain test isolation.
Applied to files:
tests/e2e/features/environment.py
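A minimal sketch of that reset pattern, assuming a standard `before_scenario` hook in `environment.py` (the flag name comes from this PR; everything else is illustrative):

```python
from behave.runner import Context


def before_scenario(context: Context, scenario) -> None:
    # The Context object is shared across the whole run, so re-initialize
    # scenario-scoped flags here to keep scenarios isolated from each other.
    context.llama_stack_was_running = False
```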
🔇 Additional comments (1)
tests/e2e/features/environment.py (1)
10-11: LGTM: Minimal, targeted imports. The added imports are scoped to the new teardown logic and look appropriate.
tisnik left a comment
LGTM
Description
Added E2E tests for health endpoints