Convert lightspeed-core to async architecture #348

eranco74 · 2025-08-05T11:55:36Z

feat: Convert lightspeed-core to async architecture

Migrate endpoints from sync to async handlers
Remove legacy sync client infrastructure
Update unit tests for async compatibility

Description

This resolves blocking behavior in all endpoints except streaming_query which was already async, enabling proper concurrent request handling.

Type of change

Related Tickets & Documents

Related Issue #
Closes #
https://issues.redhat.com/browse/MGMT-21377

Checklist before requesting a review

I have performed a self-review of my code.
PR has passed all pre-merge test jobs.
If it is a core feature, I have added thorough tests.

Testing

Please provide detailed steps to perform tests related to this code change.
How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

New Features
- All API endpoints and client operations now use asynchronous execution, improving responsiveness and scalability.
Refactor
- Migrated all synchronous client logic and endpoints to asynchronous equivalents.
- Unified agent creation and retrieval logic for consistency across endpoints.
- Simplified and standardized client initialization and MCP registration to use only asynchronous interfaces.
- Centralized agent management by moving get_agent to a shared utility module.
Bug Fixes
- Improved error handling for system prompt validation and agent/session management.
Tests
- Updated all endpoint and client tests to support asynchronous execution.
- Removed obsolete tests for deprecated synchronous client logic and local get_agent implementations.
- Added comprehensive async tests for centralized agent creation and retrieval logic.
Chores
- Cleaned up unused imports and removed legacy synchronous code.

coderabbitai · 2025-08-05T11:55:42Z

Walkthrough

This change refactors the codebase to remove all usage of the synchronous LlamaStackClient and its holder, fully migrating to the asynchronous AsyncLlamaStackClient and AsyncLlamaStackClientHolder. All endpoint handlers, utility functions, and related tests are updated to async functions, with corresponding changes to mocking and dependency injection.

Changes

Cohort / File(s)	Change Summary
Async Client Migration: Core and Endpoints `src/app/endpoints/authorized.py`, `src/app/endpoints/conversations.py`, `src/app/endpoints/models.py`, `src/app/endpoints/query.py`, `src/app/endpoints/streaming_query.py`, `src/lightspeed_stack.py`, `src/metrics/utils.py`, `src/utils/common.py`, `src/client.py`	All endpoints and utilities migrated to use `AsyncLlamaStackClientHolder` exclusively; synchronous client and holder removed; all relevant functions and handlers made async; imports and dependency usage updated accordingly.
get_agent Refactor and Centralization `src/utils/endpoints.py`, `src/app/endpoints/query.py`, `src/app/endpoints/streaming_query.py`	Local `get_agent` implementations removed; a new unified async `get_agent` is introduced in `utils.endpoints.py` and imported where needed; session and agent handling logic centralized and updated.
Test Suite Async Refactor `tests/unit/app/endpoints/test_authorized.py`, `tests/unit/app/endpoints/test_conversations.py`, `tests/unit/app/endpoints/test_models.py`, `tests/unit/app/endpoints/test_query.py`, `tests/unit/app/endpoints/test_streaming_query.py`, `tests/unit/metrics/test_utis.py`, `tests/unit/test_client.py`, `tests/unit/utils/test_common.py`	All tests updated to use async/await patterns and `AsyncMock` for clients; tests for synchronous client and local `get_agent` removed; new async tests for centralized `get_agent` added; fixtures and patches updated to reflect async usage.
New get_agent Unit Tests `tests/unit/utils/test_endpoints.py`	Adds comprehensive async unit tests for the new `get_agent` function, covering various agent/session scenarios and tool parser configurations.
Test Fixture Addition `tests/unit/conftest.py`	Adds a new pytest fixture `prepare_agent_mocks` providing async mocks for client and agent to support async test cases.
Minor/Other `src/models/config.py`	Updates a comment directive for line length; no logic changes.

Sequence Diagram(s)

sequenceDiagram
    participant Client as API Client
    participant Endpoint as FastAPI Endpoint Handler
    participant Holder as AsyncLlamaStackClientHolder
    participant LlamaStack as AsyncLlamaStackClient
    participant Utils as utils.endpoints.get_agent

    Client->>Endpoint: Send API request (e.g., /query)
    Endpoint->>Holder: get_client()
    Holder-->>Endpoint: AsyncLlamaStackClient instance
    Endpoint->>LlamaStack: Await client methods (e.g., models.list(), sessions.list())
    Endpoint->>Utils: await get_agent(...)
    Utils->>LlamaStack: Try to retrieve existing agent/session
    Utils-->>Endpoint: (AsyncAgent, conversation_id, session_id)
    Endpoint-->>Client: Return API response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

#293: Migrates setup_model_metrics to async; the current PR fully removes sync client usage and always uses async, superseding this partial migration.
#244: Implements conversation endpoints with the synchronous client; this PR refactors those endpoints to async and replaces the client.
#317: Refactors get_agent and session mapping; this PR centralizes and updates get_agent for async client usage and removes local implementations.

Suggested reviewers

manstis
tisnik

Poem

A hop, a skip, async we go,
Synchronous code now laid low.
Clients await, agents align,
Shields and sessions intertwine.
Tests now dance in async delight—
The stack’s more nimble, swift, and light!
🐇✨

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai generate unit tests to generate unit tests for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

manstis

LGTM 👍 .. I had started the same yesterday but didn't get to update all the tests.

manstis · 2025-08-05T12:03:02Z

make verify to run the failing CI checks locally to make fixing them easier.

ldjebran

LGTM

eranco74 · 2025-08-05T14:07:40Z

unit and integration are passing.
e2e is failing, this still need some work...

coderabbitai

Actionable comments posted: 10

🧹 Nitpick comments (2)

tests/unit/metrics/test_utis.py (1)
10-17: Clean up redundant mock setup

The mock setup is redundant - lines 10-12 create a mock that gets immediately replaced by lines 14-17. Consider simplifying to:
-    mock_client = mocker.patch(
-        "client.AsyncLlamaStackClientHolder.get_client"
-    ).return_value
-    # Make sure the client is an AsyncMock for async methods
-    mock_client = mocker.AsyncMock()
-    mocker.patch(
-        "client.AsyncLlamaStackClientHolder.get_client", return_value=mock_client
-    )
+    # Make sure the client is an AsyncMock for async methods
+    mock_client = mocker.AsyncMock()
+    mocker.patch(
+        "client.AsyncLlamaStackClientHolder.get_client", return_value=mock_client
+    )
src/utils/endpoints.py (1)
63-102: Well-implemented async agent management with a minor concern.

The get_agent function properly handles agent retrieval, creation, and session management. The implementation correctly manages orphan agents and enables session persistence.

However, there's a code smell at line 93 where the private attribute _agent_id is directly manipulated.

Consider adding a public method to the AsyncAgent class to update the agent_id instead of accessing the private attribute:
-        agent._agent_id = conversation_id  # type: ignore[assignment]  # pylint: disable=protected-access
+        # Consider adding a public method like agent.update_id(conversation_id)
+        agent._agent_id = conversation_id  # type: ignore[assignment]  # pylint: disable=protected-access
This would require coordination with the llama_stack_client library maintainers to add proper API support.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5afee1 and 3fc4cc2.

📒 Files selected for processing (21)

scripts/generate_openapi_schema.py (1 hunks)
src/app/endpoints/authorized.py (2 hunks)
src/app/endpoints/conversations.py (5 hunks)
src/app/endpoints/models.py (3 hunks)
src/app/endpoints/query.py (8 hunks)
src/app/endpoints/streaming_query.py (2 hunks)
src/client.py (1 hunks)
src/lightspeed_stack.py (1 hunks)
src/metrics/utils.py (2 hunks)
src/models/config.py (1 hunks)
src/utils/common.py (2 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_authorized.py (1 hunks)
tests/unit/app/endpoints/test_conversations.py (6 hunks)
tests/unit/app/endpoints/test_models.py (8 hunks)
tests/unit/app/endpoints/test_query.py (36 hunks)
tests/unit/app/endpoints/test_streaming_query.py (1 hunks)
tests/unit/metrics/test_utis.py (1 hunks)
tests/unit/test_client.py (1 hunks)
tests/unit/utils/test_common.py (5 hunks)
tests/unit/utils/test_endpoints.py (3 hunks)

🔇 Additional comments (38)

src/models/config.py (1)

95-95: LGTM: Linting directive update

The change from # noqa: C0301 (pylint) to # noqa: E501 (flake8/pycodestyle) for line length suppression is appropriate and aligns with standardizing linting tools.

scripts/generate_openapi_schema.py (2)

13-13: LGTM: Async client import update

The import change from synchronous LlamaStackClientHolder to AsyncLlamaStackClientHolder is consistent with the async migration.

19-21: LGTM: Proper async client initialization

The addition of asyncio import and using asyncio.run() to execute the async .load() method is the correct approach for initializing the async client in a synchronous context.

src/lightspeed_stack.py (2)

15-15: LGTM: Clean async-only client import

The import update to use only AsyncLlamaStackClientHolder correctly removes dependency on the synchronous client implementation.

72-75: LGTM: Proper async client initialization

The async client initialization using asyncio.run() is implemented correctly and the logging message accurately reflects the async client creation.

tests/unit/utils/test_common.py (2)

35-35: LGTM: Consistent async client mock updates

All mock patch targets have been correctly updated from LlamaStackClientHolder to AsyncLlamaStackClientHolder, maintaining consistency across all test functions.

Also applies to: 67-68, 115-116, 150-151, 214-215, 262-263

66-66: LGTM: Proper AsyncMock usage

The mock clients are correctly updated to use AsyncMock() to properly simulate asynchronous client behavior, which is essential for testing async functions.

Also applies to: 111-111, 149-149, 211-211

src/app/endpoints/models.py (2)

46-57: LGTM! Async conversion looks correct.

The function signature has been properly converted to async, and the client usage follows the expected pattern:

get_client() returns the client instance synchronously (correct)

await client.models.list() properly awaits the async method call

62-80: Error handling preserved correctly.

The existing error handling logic for APIConnectionError and general exceptions remains intact, maintaining the same HTTP status codes and error responses.

src/client.py (1)

10-10: Synchronous client removal verified – no remaining references found.

A project-wide search for LlamaStackClientHolder and LlamaStackClient (excluding the Async variants) returned zero hits. All imports and usages now reference only AsyncLlamaStackClient and AsyncLlamaStackClientHolder, confirming the sync client has been cleanly removed.

tests/unit/app/endpoints/test_authorized.py (1)

11-19: LGTM! Clean async test conversion.

The test has been properly converted to async with pytest.mark.asyncio and the handler call is correctly awaited. Using the MOCK_AUTH constant directly is a good simplification.

src/utils/common.py (2)

48-50: LGTM! Simplified to use async interface consistently.

Good refactoring - both library and service clients now use the same async interface, eliminating the need for separate sync/async code paths.

31-51: MCP registration is validated for both service and library clients.

Existing unit tests in tests/unit/utils/test_common.py exercise the default (service) branch (use_as_library_client=False) and verify toolgroups.list/register calls, while test_register_mcp_servers_async_with_library_client covers the async initialize path when use_as_library_client=True. No changes required.

src/metrics/utils.py (2)

16-16: LGTM! Simplified model listing to use async client consistently.

Good refactoring - the conditional logic has been removed and both library and service clients now use the same async interface for listing models.

13-16: Metrics setup coverage confirmed

Unit tests in tests/unit/metrics/test_utis.py already mock AsyncLlamaStackClientHolder.get_client and validate setup_model_metrics for default and non-default models. Since both library and service clients expose the same models.list() interface, this ensures metrics setup works under either deployment mode. No further changes needed.

src/app/endpoints/authorized.py (2)

6-6: LGTM: Proper FastAPI dependency injection import.

The addition of Depends import supports the conversion to FastAPI's dependency injection pattern.

33-39: LGTM: Clean async conversion with dependency injection.

The endpoint handler has been properly converted to async and now uses FastAPI's dependency injection instead of manually invoking the auth dependency with asyncio.run(). This aligns with FastAPI best practices and the broader async migration.

tests/unit/test_client.py (1)

5-5: LGTM: Import cleanup aligns with sync client removal.

The removal of LlamaStackClientHolder import is consistent with the migration to async-only architecture.

tests/unit/app/endpoints/test_conversations.py (2)

117-127: LGTM: Proper async test conversion.

The simplify_session_data tests have been correctly marked as async, even though the function itself remains synchronous. This ensures consistency across the test suite.

187-196: LGTM: Consistent async test pattern.

All test methods have been properly converted to async with correct @pytest.mark.asyncio decorator and awaited endpoint handler calls.

tests/unit/app/endpoints/test_models.py (2)

13-14: LGTM: Proper async test conversion.

The test method has been correctly converted to async with the appropriate decorator.

149-152: LGTM: Proper async mock usage.

The test correctly uses mocker.AsyncMock() for the async client and patches the right location.

src/app/endpoints/conversations.py (5)

10-10: LGTM: Import updated for async client.

The import has been correctly updated from LlamaStackClientHolder to AsyncLlamaStackClientHolder to support the async migration.

113-116: LGTM: Proper async endpoint handler conversion.

The endpoint handler has been correctly converted to async, maintaining the same parameter structure and dependency injection pattern.

135-137: LGTM: Correct async client usage.

The client instantiation and method call have been properly converted to use the async client with appropriate await keywords.

182-185: LGTM: Consistent async conversion.

The delete endpoint handler follows the same async conversion pattern as the get handler.

204-209: LGTM: Proper async client method calls.

The client instantiation and delete method call correctly use the async pattern with await.

src/app/endpoints/streaming_query.py (2)

25-25: LGTM! Good refactoring to centralize agent management.

The import of get_agent from utils.endpoints aligns with the architectural improvement to consolidate agent creation logic in a single location.

493-493: LGTM! Useful debug logging added.

The debug log for conversation and session IDs will help with troubleshooting and monitoring agent creation.

tests/unit/utils/test_endpoints.py (3)

84-110: LGTM! Well-structured test configuration fixture.

The configuration fixture provides a clean setup for testing with all necessary fields properly initialized.

177-266: Excellent test coverage for agent retrieval scenarios.

The tests properly cover both successful retrieval of existing agents and creation of new agents when not found. Good use of AsyncMock and proper assertion of all parameters.

481-533: Good coverage of no_tools flag behavior.

The test correctly verifies that tool_parser=None when no_tools=True, which is essential for the feature to bypass tools when requested.

src/utils/endpoints.py (1)

19-19: Parameter rename looks good.

The rename from configuration to config improves consistency and readability.

Also applies to: 28-28

tests/unit/app/endpoints/test_streaming_query.py (1)

143-143: LGTM! Correct migration to async client holder.

The change from LlamaStackClientHolder to AsyncLlamaStackClientHolder is consistent with the async architecture migration.

src/app/endpoints/query.py (4)

11-11: LGTM! Imports correctly updated for async architecture.

All necessary imports have been updated to their async counterparts, maintaining consistency with the architecture migration.

Also applies to: 21-21, 29-29

69-69: Async function signatures properly updated.

Both query_endpoint_handler and retrieve_response have been correctly converted to async functions.

Also applies to: 208-208

84-88: All async operations properly awaited.

The conversion correctly awaits all async operations:

Client initialization and model listing

Shield listing

Agent creation

Vector DB listing

Turn creation

No missing awaits detected.

Also applies to: 218-222, 241-249, 277-278, 286-286

251-251: Consistent debug logging added.

The debug log for conversation and session IDs matches the pattern used in streaming_query.py, maintaining consistency across endpoints.

tests/unit/app/endpoints/test_authorized.py

tests/unit/app/endpoints/test_conversations.py

tests/unit/app/endpoints/test_models.py

tests/unit/app/endpoints/test_query.py

coderabbitai

Actionable comments posted: 1

🔭 Outside diff range comments (2)

tests/unit/app/endpoints/test_query.py (1)

358-396: Critical: Missing 'prepare_agent_mocks' fixture definition.

Similar to the utils test file, this test references the prepare_agent_mocks fixture that isn't defined, causing pipeline failures across multiple test methods.

This fixture should be defined in a conftest.py file shared across test modules or added to each test file that uses it.
tests/unit/app/endpoints/test_streaming_query.py (1)
293-1435: Critical issue: Missing prepare_agent_mocks fixture breaks multiple tests.

The prepare_agent_mocks fixture has been removed but is still referenced by 11 test functions, causing all of them to fail. According to the AI summary, this fixture was deleted when get_agent tests were removed, but these remaining tests still depend on it for mocking agent and client behavior.

You need to either:

Restore the prepare_agent_mocks fixture definition, or

Update these tests to use inline mocking instead of the fixture, or

Move these tests to the appropriate location if they're testing centralized functionality
# Example fixture restoration:
@pytest.fixture(name="prepare_agent_mocks")
def prepare_agent_mocks_fixture(mocker):
    """Prepare agent mocks for testing."""
    mock_client = mocker.AsyncMock()
    mock_agent = mocker.AsyncMock()
    return mock_client, mock_agent

♻️ Duplicate comments (1)

tests/unit/app/endpoints/test_query.py (1)
59-73: Fix unreachable assertions in exception handling.

The assertions on lines 72-73 are unreachable because they come after the await statement that raises the exception. Move them outside the with block to properly verify the exception details.
 with pytest.raises(HTTPException) as e:
     await query_endpoint_handler(request)
-        assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.value.detail["response"] == "Configuration is not loaded"
+
+assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
+assert e.value.detail["response"] == "Configuration is not loaded"

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3fc4cc2 and a2d045d.

📒 Files selected for processing (21)

scripts/generate_openapi_schema.py (1 hunks)
src/app/endpoints/authorized.py (2 hunks)
src/app/endpoints/conversations.py (5 hunks)
src/app/endpoints/models.py (3 hunks)
src/app/endpoints/query.py (8 hunks)
src/app/endpoints/streaming_query.py (2 hunks)
src/client.py (1 hunks)
src/lightspeed_stack.py (1 hunks)
src/metrics/utils.py (2 hunks)
src/models/config.py (1 hunks)
src/utils/common.py (2 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_authorized.py (1 hunks)
tests/unit/app/endpoints/test_conversations.py (6 hunks)
tests/unit/app/endpoints/test_models.py (8 hunks)
tests/unit/app/endpoints/test_query.py (36 hunks)
tests/unit/app/endpoints/test_streaming_query.py (1 hunks)
tests/unit/metrics/test_utis.py (1 hunks)
tests/unit/test_client.py (1 hunks)
tests/unit/utils/test_common.py (5 hunks)
tests/unit/utils/test_endpoints.py (3 hunks)

✅ Files skipped from review due to trivial changes (1)

tests/unit/metrics/test_utis.py

🚧 Files skipped from review as they are similar to previous changes (15)

scripts/generate_openapi_schema.py
tests/unit/app/endpoints/test_authorized.py
src/metrics/utils.py
src/app/endpoints/authorized.py
src/models/config.py
src/lightspeed_stack.py
tests/unit/utils/test_common.py
src/app/endpoints/models.py
src/app/endpoints/conversations.py
src/utils/common.py
tests/unit/app/endpoints/test_models.py
tests/unit/test_client.py
tests/unit/app/endpoints/test_conversations.py
src/client.py
src/app/endpoints/streaming_query.py

🧰 Additional context used

🪛 GitHub Actions: Unit tests

tests/unit/utils/test_endpoints.py

[error] 177-177: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 210-210: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 268-268: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 322-322: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 376-376: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 432-432: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 481-481: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 536-536: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

tests/unit/app/endpoints/test_query.py

[error] 358-358: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 399-399: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 435-435: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 484-484: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 536-536: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 602-602: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 651-651: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 709-709: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 783-783: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 837-837: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 932-932: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 1201-1201: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 1251-1251: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

tests/unit/app/endpoints/test_streaming_query.py

[error] 293-293: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 332-332: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 369-369: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 417-417: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 468-468: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 533-533: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 581-581: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 985-985: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 1058-1058: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 1329-1329: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

[error] 1379-1379: Pytest error: fixture 'prepare_agent_mocks' not found. This fixture is required for test setup but is missing or not defined.

🔇 Additional comments (14)

src/utils/endpoints.py (4)

3-13: LGTM! Clean async migration imports.

The import additions are appropriate for the async architecture migration, including the contextlib.suppress for graceful error handling and the async client/agent classes.

19-25: LGTM! Consistent parameter naming.

Good refactoring to standardize parameter names from configuration to config for consistency across the codebase.

28-59: LGTM! Improved error handling for system prompt validation.

The function now properly raises HTTP 422 errors when system prompt customization is disabled but a system prompt is provided. The logic flow and error messaging are clear.

63-102: Excellent async agent management with session persistence.

The get_agent function is well-designed with:

Proper error suppression for non-critical agent retrieval failures

Session persistence logic that reuses existing sessions or creates new ones

Clean separation of concerns between agent creation and session management

Good use of type hints and async/await patterns

The agent ID manipulation on line 93 uses protected access but is necessary for session continuity.

tests/unit/utils/test_endpoints.py (2)

84-109: LGTM! Well-structured test configuration fixture.

The setup_configuration fixture provides a comprehensive test configuration that mirrors the structure needed for the async architecture tests.

177-594: Comprehensive async test coverage for get_agent function.

The test suite excellently covers all scenarios:

Agent retrieval with existing conversation IDs

New agent creation when no existing agent found

Empty shields handling

Multiple MCP servers configuration

Session persistence verification

Tool parser behavior with no_tools flag

The async test patterns and mocking are implemented correctly. Once the missing fixture is resolved, these tests will provide excellent coverage.

tests/unit/app/endpoints/test_query.py (2)

97-134: Excellent async conversion of test helper function.

The _test_query_endpoint_handler function is properly converted to async with:

AsyncMock for client mocking

Correct patching of AsyncLlamaStackClientHolder

Proper await usage for async function calls

The test logic and assertions remain intact while adapting to the async pattern.

97-1322: Comprehensive async conversion with proper mock usage.

The entire test suite has been successfully converted to async patterns:

All endpoint handler tests use await correctly

Mock clients upgraded to AsyncMock for proper async behavior

Client holder patches updated to AsyncLlamaStackClientHolder.get_client

Test assertions and logic preserved while adapting to async execution

The conversion maintains test coverage while properly supporting the new async architecture.

tests/unit/app/endpoints/test_streaming_query.py (1)

143-143: LGTM! Correctly updated to async client holder.

This change properly updates the mock to use AsyncLlamaStackClientHolder instead of the synchronous version, which aligns with the overall migration to async architecture.

src/app/endpoints/query.py (5)

11-11: LGTM! Import changes correctly support async refactoring.

The imports have been properly updated to support the async architecture:

AsyncLlamaStackClient for async client operations

AsyncLlamaStackClientHolder replacing the sync version

get_agent now imported from centralized utils.endpoints

Also applies to: 21-21, 29-29

69-69: LGTM! Endpoint handler correctly converted to async.

The function signature has been properly updated to async to support awaiting I/O operations throughout the function.

84-88: LGTM! Client usage properly updated for async operations.

The client instantiation and method calls have been correctly updated:

Using AsyncLlamaStackClientHolder for async client management

Properly awaiting client.models.list() and retrieve_response() calls

Error handling structure preserved

208-302: LGTM! retrieve_response function comprehensively updated for async operation.

The function has been properly refactored with all necessary async changes:

Function signature updated to async with AsyncLlamaStackClient parameter

All I/O operations properly awaited (shields.list(), get_agent(), vector_dbs.list(), agent.create_turn())

Logic flow and error handling preserved

Useful debug logging added for conversation and session IDs

The async conversion is complete and follows best practices.

251-251: LGTM! Useful debug logging added.

The debug logging for conversation and session IDs provides valuable tracing information that will help with debugging and request monitoring.

tests/unit/utils/test_endpoints.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tests/unit/app/endpoints/test_query.py (1)
71-74: Fix unreachable assertions in exception handling

The assertions on lines 72-73 are unreachable because they come after the await statement that raises the exception. Move them outside the with block to properly verify the exception details.
 with pytest.raises(HTTPException) as e:
     await query_endpoint_handler(request)
-        assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.value.detail["response"] == "Configuration is not loaded"
+
+assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
+assert e.value.detail["response"] == "Configuration is not loaded"

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a2d045d and ed63513.

📒 Files selected for processing (22)

scripts/generate_openapi_schema.py (1 hunks)
src/app/endpoints/authorized.py (2 hunks)
src/app/endpoints/conversations.py (5 hunks)
src/app/endpoints/models.py (3 hunks)
src/app/endpoints/query.py (8 hunks)
src/app/endpoints/streaming_query.py (2 hunks)
src/client.py (1 hunks)
src/lightspeed_stack.py (1 hunks)
src/metrics/utils.py (2 hunks)
src/models/config.py (1 hunks)
src/utils/common.py (2 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_authorized.py (1 hunks)
tests/unit/app/endpoints/test_conversations.py (6 hunks)
tests/unit/app/endpoints/test_models.py (8 hunks)
tests/unit/app/endpoints/test_query.py (36 hunks)
tests/unit/app/endpoints/test_streaming_query.py (1 hunks)
tests/unit/conftest.py (1 hunks)
tests/unit/metrics/test_utis.py (1 hunks)
tests/unit/test_client.py (1 hunks)
tests/unit/utils/test_common.py (5 hunks)
tests/unit/utils/test_endpoints.py (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (18)

src/models/config.py
src/app/endpoints/models.py
scripts/generate_openapi_schema.py
tests/unit/metrics/test_utis.py
tests/unit/utils/test_common.py
src/metrics/utils.py
src/app/endpoints/authorized.py
tests/unit/test_client.py
src/utils/common.py
tests/unit/app/endpoints/test_authorized.py
src/lightspeed_stack.py
tests/unit/app/endpoints/test_models.py
src/client.py
src/app/endpoints/conversations.py
tests/unit/app/endpoints/test_conversations.py
src/app/endpoints/streaming_query.py
tests/unit/utils/test_endpoints.py
tests/unit/app/endpoints/test_streaming_query.py

🧰 Additional context used

🪛 GitHub Actions: Pydocstyle

tests/unit/conftest.py

[error] 10-10: pydocstyle D401: First line should be in imperative mood (perhaps 'Prepare', not 'Preparation') in public function prepare_agent_mocks_fixture.

🔇 Additional comments (9)

tests/unit/conftest.py (1)

8-28: Well-designed async fixture for agent mocking

The fixture provides a clean abstraction for async testing with proper setup of agent_id properties and create_turn mock structure. This will help maintain consistency across async test suites.

src/utils/endpoints.py (3)

19-19: LGTM! Parameter naming consistency

Good change to standardize parameter naming across the utility functions.

28-28: LGTM! Parameter naming consistency

Consistent parameter naming improves code readability.

63-102: Excellent async agent management implementation

The new async get_agent function is well-designed with:

Proper error handling using suppress for graceful failure on agent retrieval

Comprehensive session management with orphan agent cleanup

Clear separation of concerns between existing and new agent flows

Good logging for debugging conversation and session IDs

The implementation correctly handles the async nature of the LlamaStack client operations.

tests/unit/app/endpoints/test_query.py (1)

59-61: Excellent async test conversion

The conversion from synchronous to asynchronous tests is comprehensive and well-executed:

Proper use of @pytest.mark.asyncio decorator

All async function calls correctly awaited

Mock clients properly converted to AsyncMock

Client holder patches updated to use AsyncLlamaStackClientHolder

The test logic and assertions remain consistent while adapting to the new async architecture.

Also applies to: 97-135, 160-169, 358-397

src/app/endpoints/query.py (4)

11-11: LGTM! Proper async client imports

Good addition of async client imports to support the migration to asynchronous architecture.

Also applies to: 21-21

29-29: Excellent architectural improvement

Importing get_agent from utils.endpoints centralizes agent management logic and eliminates code duplication across endpoint handlers.

69-69: Comprehensive async conversion

The conversion to async functions is well-executed:

Endpoint handler properly converted to async

Client operations correctly awaited

Error handling and API contract preserved

Maintains existing functionality while enabling better concurrency

The async implementation will resolve the blocking behavior mentioned in the PR objectives.

Also applies to: 84-94, 208-209

218-223: Proper async I/O operations

All client operations (shields.list, vector_dbs.list, get_agent, create_turn) are correctly awaited, ensuring non-blocking execution throughout the request pipeline.

Also applies to: 241-251, 276-278, 286-292

tests/unit/conftest.py

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

tests/unit/app/endpoints/test_query.py (1)
71-73: Fix unreachable assertions in exception handling.

The assertions are still unreachable because they come after the await statement that raises the exception. This was flagged in a previous review but not properly fixed.
 with pytest.raises(HTTPException) as e:
     await query_endpoint_handler(request)
-        assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
-        assert e.value.detail["response"] == "Configuration is not loaded"
+
+assert e.value.status_code == status.HTTP_500_INTERNAL_SERVER_ERROR
+assert e.value.detail["response"] == "Configuration is not loaded"

🧹 Nitpick comments (2)

src/utils/endpoints.py (2)
75-77: Consider more specific exception handling.

Using suppress(ValueError) is quite broad. Consider catching more specific exceptions that client.agents.retrieve might raise when an agent doesn't exist.
-        with suppress(ValueError):
+        with suppress(AgentNotFoundError, ValueError):
             agent_response = await client.agents.retrieve(agent_id=conversation_id)
             existing_agent_id = agent_response.agent_id
62-102: Consider simplifying the complex agent management logic.

This function handles multiple responsibilities: agent creation, ID management, session handling, and cleanup. Consider breaking it into smaller, more focused functions for better maintainability.

Consider refactoring into:

create_new_agent()

handle_existing_agent()

setup_agent_session()

This would improve readability and make the logic easier to test and maintain.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d2d7ce and 8b20317.

📒 Files selected for processing (22)

scripts/generate_openapi_schema.py (1 hunks)
src/app/endpoints/authorized.py (2 hunks)
src/app/endpoints/conversations.py (5 hunks)
src/app/endpoints/models.py (3 hunks)
src/app/endpoints/query.py (8 hunks)
src/app/endpoints/streaming_query.py (2 hunks)
src/client.py (1 hunks)
src/lightspeed_stack.py (1 hunks)
src/metrics/utils.py (2 hunks)
src/models/config.py (1 hunks)
src/utils/common.py (2 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_authorized.py (1 hunks)
tests/unit/app/endpoints/test_conversations.py (6 hunks)
tests/unit/app/endpoints/test_models.py (8 hunks)
tests/unit/app/endpoints/test_query.py (36 hunks)
tests/unit/app/endpoints/test_streaming_query.py (1 hunks)
tests/unit/conftest.py (1 hunks)
tests/unit/metrics/test_utis.py (1 hunks)
tests/unit/test_client.py (1 hunks)
tests/unit/utils/test_common.py (5 hunks)
tests/unit/utils/test_endpoints.py (3 hunks)

✅ Files skipped from review due to trivial changes (2)

scripts/generate_openapi_schema.py
tests/unit/conftest.py

🚧 Files skipped from review as they are similar to previous changes (17)

src/models/config.py
src/lightspeed_stack.py
tests/unit/metrics/test_utis.py
src/utils/common.py
src/app/endpoints/authorized.py
tests/unit/test_client.py
tests/unit/utils/test_common.py
src/app/endpoints/models.py
tests/unit/app/endpoints/test_conversations.py
tests/unit/app/endpoints/test_models.py
src/client.py
src/app/endpoints/conversations.py
src/metrics/utils.py
tests/unit/utils/test_endpoints.py
src/app/endpoints/streaming_query.py
src/app/endpoints/query.py
tests/unit/app/endpoints/test_streaming_query.py

🧰 Additional context used

🧬 Code Graph Analysis (2)

tests/unit/app/endpoints/test_authorized.py (2)

src/app/endpoints/authorized.py (1)

authorized_endpoint_handler (33-39)

src/auth/utils.py (1)

extract_user_token (7-26)

src/utils/endpoints.py (4)

src/models/requests.py (1)

QueryRequest (70-158)

src/configuration.py (3)

configuration (48-53)

AppConfig (21-114)

customization (101-106)

src/utils/suid.py (1)

get_suid (6-12)

src/utils/types.py (2)

GraniteToolParser (23-37)

get_parser (33-37)

🔇 Additional comments (10)

src/utils/endpoints.py (2)

3-13: LGTM!

The new imports support the async migration well. The suppress import is appropriately used for graceful exception handling when checking for existing agents.

19-59: LGTM!

The parameter name changes from configuration to config improve consistency and readability without affecting functionality.

tests/unit/app/endpoints/test_authorized.py (4)

3-16: LGTM!

The imports and async test conversion are appropriate for testing the async endpoint handler. The MOCK_AUTH constant provides consistent test data.

24-41: Clarify the test scenario validity.

The comment states that "FastAPI's dependency injection would prevent the handler from being called if auth fails," but then tests scenarios where invalid auth data reaches the handler. This seems contradictory.

If the dependency injection truly prevents invalid auth from reaching the handler, these tests might be testing unrealistic scenarios. Consider:

Testing the auth dependency itself instead of the handler

Documenting when/why invalid auth data might bypass dependency injection

Testing the actual FastAPI integration with invalid auth

43-62: LGTM!

These tests appropriately validate the extract_user_token utility function, covering the realistic scenarios where authorization headers are missing or malformed. The exception handling and assertions are correct.

64-92: LGTM!

This test appropriately simulates the auth dependency behavior and validates the expected 403 Forbidden exception. This approach aligns better with the actual authentication flow.

tests/unit/app/endpoints/test_query.py (4)

100-134: LGTM!

The async conversion is properly implemented with AsyncMock for the client and correct patching of AsyncLlamaStackClientHolder.get_client. The await statements are correctly placed.

358-397: LGTM!

The async test conversion is consistently applied with proper @pytest.mark.asyncio decorators and await statements. The test logic and assertions are preserved correctly.

1067-1089: LGTM!

The connection error test is properly converted to async with correct mock patching and exception handling. The test maintains its original intent while working with the async architecture.

1-1322: Comprehensive async conversion completed successfully.

The entire test suite has been consistently converted to async patterns with:

Proper @pytest.mark.asyncio decorators

Correct use of AsyncMock for async operations

Updated patching targets for async client holders

Preserved test logic and coverage

The removal of get_agent tests aligns with the centralization of that functionality in utils.endpoints.

src/utils/endpoints.py

tisnik

it looks ok!

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

src/utils/endpoints.py (1)

91-94: Direct assignment acknowledged - SDK workaround.

As noted in the retrieved learnings, this direct assignment to agent._agent_id is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. The existing comments and type ignores properly document this exception.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b20317 and a432dcb.

📒 Files selected for processing (22)

scripts/generate_openapi_schema.py (1 hunks)
src/app/endpoints/authorized.py (2 hunks)
src/app/endpoints/conversations.py (5 hunks)
src/app/endpoints/models.py (3 hunks)
src/app/endpoints/query.py (8 hunks)
src/app/endpoints/streaming_query.py (2 hunks)
src/client.py (1 hunks)
src/lightspeed_stack.py (1 hunks)
src/metrics/utils.py (2 hunks)
src/models/config.py (1 hunks)
src/utils/common.py (2 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_authorized.py (1 hunks)
tests/unit/app/endpoints/test_conversations.py (6 hunks)
tests/unit/app/endpoints/test_models.py (8 hunks)
tests/unit/app/endpoints/test_query.py (36 hunks)
tests/unit/app/endpoints/test_streaming_query.py (1 hunks)
tests/unit/conftest.py (1 hunks)
tests/unit/metrics/test_utis.py (1 hunks)
tests/unit/test_client.py (1 hunks)
tests/unit/utils/test_common.py (5 hunks)
tests/unit/utils/test_endpoints.py (3 hunks)

🚧 Files skipped from review as they are similar to previous changes (18)

src/models/config.py
tests/unit/utils/test_common.py
src/lightspeed_stack.py
src/client.py
scripts/generate_openapi_schema.py
src/app/endpoints/models.py
tests/unit/metrics/test_utis.py
tests/unit/app/endpoints/test_models.py
tests/unit/app/endpoints/test_conversations.py
tests/unit/test_client.py
tests/unit/app/endpoints/test_authorized.py
tests/unit/conftest.py
src/app/endpoints/authorized.py
src/metrics/utils.py
src/app/endpoints/conversations.py
src/app/endpoints/streaming_query.py
src/utils/common.py
tests/unit/utils/test_endpoints.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: the direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for...

Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.049Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.

Applied to files:

src/utils/endpoints.py
src/app/endpoints/query.py
tests/unit/app/endpoints/test_query.py
tests/unit/app/endpoints/test_streaming_query.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: build-pr
GitHub Check: e2e_tests

🔇 Additional comments (21)

src/utils/endpoints.py (4)

3-13: LGTM - Clean async imports and utilities.

The imports are well-organized with proper async client classes and utility imports. The addition of contextlib.suppress and logging support aligns with the async refactor.

19-19: Parameter rename improves consistency.

The parameter name change from configuration to config creates consistency across the codebase.

28-59: Enhanced error handling for system prompt customization.

The function now properly validates system prompt usage when customization is disabled, providing clear error messages. The parameter rename maintains consistency.

62-102: Ensure sessions_response.data Always Contains At Least One Session
Accessing sessions_response.data[0] in src/utils/endpoints.py (Line 95) assumes the API will never return an empty list. While your tests mock a non‐empty response, there’s no handling for an empty data array. If the LlamaStack API ever returns data=[] instead of raising an error, this will throw an IndexError.

Please verify the API’s contract or add fallback logic, for example:

Raise a clear exception if data is empty.

Default to creating a new session when no previous sessions exist.

src/app/endpoints/query.py (9)

11-11: Clean async client import.

The import change to AsyncLlamaStackClient aligns with the async refactor.

21-29: Proper async dependencies and imports.

The imports correctly reference the async client holder and the centralized get_agent function from utils.endpoints.

69-73: Function signature properly converted to async.

The endpoint handler is correctly converted to an async function, maintaining the same return type and parameters.

84-94: Async client usage and proper awaiting.

All client operations are properly awaited, including models.list() and retrieve_response(). The async pattern is correctly implemented.

208-209: Function signature converted to async with proper type hints.

The retrieve_response function is correctly converted to async with proper parameter types.

217-223: Shield retrieval properly awaited.

Shield listing operations are correctly awaited and the filtering logic remains intact.

241-251: Centralized agent creation with proper logging.

The switch to using the imported get_agent function centralizes the logic and the added logging for conversation and session IDs improves observability.

276-278: Vector DB listing properly awaited.

The vector database list operation is correctly awaited.

286-292: Agent turn creation properly awaited.

The agent.create_turn operation is correctly awaited with all proper parameters.

tests/unit/app/endpoints/test_query.py (7)

59-73: Test properly converted to async with correct exception handling.

The test is correctly decorated with @pytest.mark.asyncio, the function is async, and the assertion placement after the exception context is correct (this was previously an issue that has been addressed).

97-134: Helper function converted to async with proper mocking.

The _test_query_endpoint_handler function is properly converted to async with AsyncMock for the client and correct awaiting of the endpoint handler call.

160-169: Test functions properly decorated and awaited.

Both test functions are correctly decorated with @pytest.mark.asyncio and properly await the helper function.

358-397: Retrieve response tests converted to async.

All the retrieve_response tests are properly converted to async with correct decorators and awaiting. The mock setup uses the centralized prepare_agent_mocks fixture.

1067-1088: Connection error test properly converted.

The test correctly uses AsyncMock and patches the async client holder, with proper awaiting of the endpoint handler.

1099-1105: Mock client setup uses AsyncMock correctly.

The mock client is properly set up as an AsyncMock and the patch targets the async client holder.

1127-1199: No-tools tests properly converted to async.

Both no-tools test variants are correctly converted with async decorators, proper mock setup, and awaited endpoint calls.

tests/unit/app/endpoints/test_streaming_query.py (1)

143-143: LGTM: Async client migration implemented correctly.

The test correctly uses AsyncLlamaStackClientHolder.get_client which aligns with the overall async architecture migration described in the PR objectives.

coderabbitai · 2025-08-06T10:39:34Z

tests/unit/app/endpoints/test_streaming_query.py

    mock_client = mocker.AsyncMock()
    mock_client.models.side_effect = APIConnectionError(request=query_request)
-    mock_lsc = mocker.patch("client.LlamaStackClientHolder.get_client")
+    mock_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client")
    mock_lsc.return_value = mock_client
    mock_async_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client")
    mock_async_lsc.return_value = mock_client


🛠️ Refactor suggestion

Remove redundant client mocking.

The same client is being patched twice with identical return values, which is redundant and potentially confusing.

Apply this diff to remove the duplicate patch:

# simulate situation when it is not possible to connect to Llama Stack mock_client = mocker.AsyncMock() mock_client.models.side_effect = APIConnectionError(request=query_request) -mock_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client") -mock_lsc.return_value = mock_client mock_async_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client") mock_async_lsc.return_value = mock_client

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

mock_client = mocker.AsyncMock()

mock_client.models.side_effect = APIConnectionError(request=query_request)

mock_lsc = mocker.patch("client.LlamaStackClientHolder.get_client")

mock_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client")

mock_lsc.return_value = mock_client

mock_async_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client")

mock_async_lsc.return_value = mock_client

# simulate situation when it is not possible to connect to Llama Stack

mock_client = mocker.AsyncMock()

mock_client.models.side_effect = APIConnectionError(request=query_request)

mock_async_lsc = mocker.patch("client.AsyncLlamaStackClientHolder.get_client")

mock_async_lsc.return_value = mock_client

🤖 Prompt for AI Agents

In tests/unit/app/endpoints/test_streaming_query.py around lines 141 to 146, there are two identical patches for AsyncLlamaStackClientHolder.get_client returning the same mock client, which is redundant. Remove one of the duplicate mocker.patch calls so that AsyncLlamaStackClientHolder.get_client is patched only once with the mock_client as its return value.

- Migrate all sync endpoints to async handlers (query, conversations, models, etc.) - Replace sync LlamaStackClient with AsyncLlamaStackClient throughout - Implement proper agent initialization with await agent.initialize() - Centralize test fixtures in conftest.py and fix agent mocking - Remove legacy sync client infrastructure This resolves blocking behavior in all endpoints except streaming_query which was already async, enabling proper concurrent request handling. Signed-off-by: Eran Cohen <[email protected]>

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (3)

tests/unit/test_client.py (3)
43-43: Missing @pytest.mark.asyncio decorator for async test.

This async test function is missing the required pytest decorator.

Apply this diff to fix the issue:
+@pytest.mark.asyncio
 async def test_get_async_llama_stack_remote_client() -> None:
55-57: Inconsistent client handling compared to library mode test.

The remote client test doesn't use an async context manager or properly close the client, unlike the library mode test (lines 36-40). This could lead to resource leaks.

Consider using the same pattern as the library mode test:
-    ls_client = client.get_client()
-    assert ls_client is not None
+    async with client.get_client() as ls_client:
+        assert ls_client is not None
+        assert not ls_client.is_closed()
+        await ls_client.close()
+        assert ls_client.is_closed()
59-59: Missing @pytest.mark.asyncio decorator for async test.

This async test function is missing the required pytest decorator.

Apply this diff to fix the issue:
+@pytest.mark.asyncio
 async def test_get_async_llama_stack_wrong_configuration() -> None:

♻️ Duplicate comments (1)

src/utils/endpoints.py (1)

91-97: Agent ID workaround is documented in learnings.

The direct assignment to agent._agent_id is a necessary workaround for the missing agent rehydration feature in the LLS client SDK, as documented in the retrieved learnings. The implementation correctly handles orphan agent cleanup and session retrieval.

🧹 Nitpick comments (1)

tests/unit/app/endpoints/test_authorized.py (1)
24-41: Test name is misleading - not testing authorization failures.

This test is checking malformed input scenarios (TypeError/ValueError) rather than actual authorization failures. The test name test_authorized_unauthorized suggests it's testing unauthorized access, but it's actually testing programming errors that shouldn't occur in normal operation.

Consider renaming to better reflect what's being tested:
-async def test_authorized_unauthorized():
-    """Test the authorized endpoint handler behavior under unauthorized conditions.
+async def test_authorized_malformed_input():
+    """Test the authorized endpoint handler behavior with malformed auth data.
The actual authorization failures are properly tested in test_authorized_dependency_unauthorized.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2796dc8 and 83b4fac.

📒 Files selected for processing (22)

scripts/generate_openapi_schema.py (1 hunks)
src/app/endpoints/authorized.py (2 hunks)
src/app/endpoints/conversations.py (5 hunks)
src/app/endpoints/models.py (3 hunks)
src/app/endpoints/query.py (8 hunks)
src/app/endpoints/streaming_query.py (2 hunks)
src/client.py (1 hunks)
src/lightspeed_stack.py (1 hunks)
src/metrics/utils.py (2 hunks)
src/models/config.py (1 hunks)
src/utils/common.py (2 hunks)
src/utils/endpoints.py (2 hunks)
tests/unit/app/endpoints/test_authorized.py (1 hunks)
tests/unit/app/endpoints/test_conversations.py (6 hunks)
tests/unit/app/endpoints/test_models.py (8 hunks)
tests/unit/app/endpoints/test_query.py (36 hunks)
tests/unit/app/endpoints/test_streaming_query.py (1 hunks)
tests/unit/conftest.py (1 hunks)
tests/unit/metrics/test_utis.py (1 hunks)
tests/unit/test_client.py (1 hunks)
tests/unit/utils/test_common.py (5 hunks)
tests/unit/utils/test_endpoints.py (3 hunks)

✅ Files skipped from review due to trivial changes (1)

scripts/generate_openapi_schema.py

🚧 Files skipped from review as they are similar to previous changes (18)

tests/unit/metrics/test_utis.py
src/metrics/utils.py
src/lightspeed_stack.py
tests/unit/utils/test_common.py
src/client.py
src/models/config.py
tests/unit/app/endpoints/test_models.py
src/app/endpoints/models.py
src/app/endpoints/streaming_query.py
src/app/endpoints/authorized.py
tests/unit/conftest.py
tests/unit/utils/test_endpoints.py
src/app/endpoints/conversations.py
tests/unit/app/endpoints/test_streaming_query.py
src/utils/common.py
tests/unit/app/endpoints/test_conversations.py
tests/unit/app/endpoints/test_query.py
src/app/endpoints/query.py

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: the direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for...

Learnt from: eranco74
PR: lightspeed-core/lightspeed-stack#348
File: src/utils/endpoints.py:91-94
Timestamp: 2025-08-06T06:02:21.060Z
Learning: The direct assignment to `agent._agent_id` in `src/utils/endpoints.py` is a necessary workaround for the missing agent rehydration feature in the LLS client SDK. This allows preserving conversation IDs when handling existing agents.

Applied to files:

src/utils/endpoints.py

🧬 Code Graph Analysis (1)

tests/unit/app/endpoints/test_authorized.py (2)

src/app/endpoints/authorized.py (1)

authorized_endpoint_handler (33-39)

src/auth/utils.py (1)

extract_user_token (7-26)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: build-pr

🔇 Additional comments (9)

src/utils/endpoints.py (8)

3-13: LGTM! Import changes support async migration.

The import changes correctly introduce the async client and agent classes, along with necessary utilities for the new async functionality.

19-21: LGTM! Parameter rename improves consistency.

The parameter rename from configuration to config improves consistency with the codebase.

28-59: LGTM! Good UX improvement with system prompt validation.

The parameter rename and new HTTP 422 error handling for disabled system prompts improves user experience by providing clear feedback when system prompt customization is disabled.

63-72: LGTM! Well-documented async function signature.

The function signature is well-designed with clear parameter types and comprehensive docstring. The async implementation correctly centralizes agent management logic.

74-78: Good error handling with contextlib.suppress.

Using suppress(ValueError) is appropriate here as it allows graceful fallback to creating a new agent when retrieval fails, without cluttering the code with try-catch blocks.

79-89: LGTM! Proper async agent initialization.

The AsyncAgent creation and initialization is correctly implemented with appropriate configuration for shields, tool parser, and session persistence.

98-102: LGTM! Proper session management for new agents.

The session creation logic correctly uses the agent's ID as conversation ID and creates a new session with a unique identifier.

63-102: Consider the function complexity but centralization benefits outweigh concerns.

While this function handles multiple responsibilities (agent retrieval, creation, session management), the centralization provides significant benefits by eliminating code duplication across endpoints and ensuring consistent agent/session handling. The async implementation is correct and well-documented.

tests/unit/app/endpoints/test_authorized.py (1)

43-61: Good implementation of auth dependency testing.

This test properly validates the auth dependency behavior by testing the extract_user_token utility function with various failure scenarios. It addresses the previous review comment about testing actual endpoint behavior under unauthorized conditions.

manstis approved these changes Aug 5, 2025

View reviewed changes

eranco74 force-pushed the MGMT-21377 branch from f3774dc to 6db5d7f Compare August 5, 2025 12:17

ldjebran approved these changes Aug 5, 2025

View reviewed changes

eranco74 force-pushed the MGMT-21377 branch from 6db5d7f to 9a91944 Compare August 5, 2025 13:57

eranco74 force-pushed the MGMT-21377 branch 3 times, most recently from 8e2d70b to 3fc4cc2 Compare August 5, 2025 17:09

eranco74 marked this pull request as ready for review August 5, 2025 17:14

coderabbitai bot reviewed Aug 5, 2025

View reviewed changes

eranco74 force-pushed the MGMT-21377 branch from 3fc4cc2 to a2d045d Compare August 5, 2025 17:24

coderabbitai bot reviewed Aug 5, 2025

View reviewed changes

tests/unit/utils/test_endpoints.py Show resolved Hide resolved

eranco74 force-pushed the MGMT-21377 branch from a2d045d to ed63513 Compare August 5, 2025 17:49

coderabbitai bot reviewed Aug 5, 2025

View reviewed changes

tests/unit/conftest.py Outdated Show resolved Hide resolved

eranco74 force-pushed the MGMT-21377 branch 2 times, most recently from a91cfa7 to 5d2d7ce Compare August 5, 2025 18:58

eranco74 mentioned this pull request Aug 5, 2025

LCORE-470: initalize agent in query #351

Merged

eranco74 force-pushed the MGMT-21377 branch from 5d2d7ce to 8b20317 Compare August 5, 2025 19:29

coderabbitai bot reviewed Aug 5, 2025

View reviewed changes

src/utils/endpoints.py Show resolved Hide resolved

maorfr approved these changes Aug 6, 2025

View reviewed changes

tisnik approved these changes Aug 6, 2025

View reviewed changes

eranco74 force-pushed the MGMT-21377 branch from 8b20317 to a432dcb Compare August 6, 2025 10:36

coderabbitai bot reviewed Aug 6, 2025

View reviewed changes

eranco74 force-pushed the MGMT-21377 branch from 2796dc8 to 83b4fac Compare August 6, 2025 10:59

coderabbitai bot reviewed Aug 6, 2025

View reviewed changes

tisnik merged commit 739b786 into lightspeed-core:main Aug 6, 2025
18 checks passed

coderabbitai bot mentioned this pull request Aug 18, 2025

implements referenced documents on /query and updates /streaming_query to match #403

Open

18 tasks

This was referenced Aug 19, 2025

Role-based authorization layer #356

Merged

Add tool calls to stored transcripts #404

Merged

coderabbitai bot mentioned this pull request Sep 2, 2025

[RHDHPAI-976] Add customization profiles via path #487

Merged

18 tasks

This was referenced Sep 15, 2025

[RHDHPAI-1097]Question validation #541

Closed

[RHDHPAI-978] Topic summary of initial query #564

Merged

This was referenced Sep 29, 2025

Responses v2 support #540

Closed

LCORE-455: Removed global vars to make compatible with uvicorn workers > 1 #604

Merged

This was referenced Oct 9, 2025

LCORE-393: Providers endpoint #641

Merged

LCORE-740: fixed types in test_query.py #665

Merged

This was referenced Oct 19, 2025

LCORE-739: type hint for mocker fixture #691

Merged

[RHDHPAI-1143] Implement referenced_documents caching #643

Merged

LCORE-417 Convert unittest mocking to pytest mocking #704

Merged

This was referenced Nov 4, 2025

Base implementation of streaming Responses API #754

Merged

LCORE-410: Model quota limit exception handling #772

Merged

Convert lightspeed-core to async architecture #348

Convert lightspeed-core to async architecture #348

Uh oh!

Conversation

eranco74 commented Aug 5, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Related Tickets & Documents

Checklist before requesting a review

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

manstis left a comment

Choose a reason for hiding this comment

Uh oh!

manstis commented Aug 5, 2025

Uh oh!

ldjebran left a comment

Choose a reason for hiding this comment

Uh oh!

eranco74 commented Aug 5, 2025

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tisnik left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Aug 6, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

eranco74 commented Aug 5, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Aug 5, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)