fix: Migrate from deprecated vector_dbs to vector_stores API #678

luis5tb · 2025-10-15T13:06:11Z

Description

This PR updates the codebase to use the new vector_stores API instead of the deprecated vector_dbs API, which was removed in llama-stack PR #3774.

Type of change

Related Tickets & Documents

Related Issue [BUG] VectorDB API removed on llamastack #677

Summary by CodeRabbit

Refactor
- Backend now consistently uses "vector store" identifiers instead of "vector database" identifiers across query and streaming flows.
- RAG toolgroup selection and construction updated to accept and pass vector store IDs; streaming flow also includes server names when building toolgroups.
- Conditional handling preserved so toolgroups remain disabled when no IDs are available.
Tests
- Unit tests updated to reflect the new vector store data structures and access patterns.

coderabbitai · 2025-10-15T13:06:39Z

Walkthrough

Replaces vector DB listing with vector store listing in query endpoints; updates get_rag_toolgroups signature and internals to accept vector_store_ids; streaming and non-streaming flows derive IDs from client.vector_stores.list() and tests/mocks updated to the .data response shape and id attribute.

Changes

Cohort / File(s)	Summary
RAG toolgroup parameter migration `src/app/endpoints/query.py`	Signature and internals updated: `vector_db_ids` → `vector_store_ids`; docstring, conditional checks, and ToolgroupAgentToolGroupWithArgs usage now pass `vector_store_ids`.
Streaming endpoint alignment `src/app/endpoints/streaming_query.py`	Replaced `vector_dbs.list()` usage with `vector_stores.list()`; compute `vector_store_ids` from response `.data`, pass to `get_rag_toolgroups`, preserve empty-toolgroup → `None` behavior.
Unit test updates `tests/unit/app/endpoints/test_query.py`, `tests/unit/app/endpoints/test_streaming_query.py`	Tests updated to mock `client.vector_stores.list()` returning a wrapper with `.data` list of stores exposing `id`; local vars renamed to `vector_store_ids`; assertions and mocks adjusted to use `id` and `.data`.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as HTTP Client
  participant Endpoint as query / streaming_query
  participant SDK as API Client
  participant RAG as get_rag_toolgroups
  participant Agent as Toolgroup Agent

  Client->>Endpoint: Request (query / streaming)
  Endpoint->>SDK: client.vector_stores.list()
  note right of SDK `#D6EAF8`: returns response object with `.data` list
  SDK-->>Endpoint: list response (data -> vector_store objects)
  Endpoint->>Endpoint: extract vector_store_ids from .data (ids)
  Endpoint->>RAG: get_rag_toolgroups(vector_store_ids)
  RAG-->>Endpoint: Toolgroups or None
  Endpoint->>Agent: Build/run agent with toolgroups
  Agent-->>Client: Response / stream

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Verify all renames from vector_db_ids → vector_store_ids and corresponding docstring/type updates.
Confirm client.vector_stores.list() usage and that tests mock the .data wrapper consistently.
Check ToolgroupAgentToolGroupWithArgs argument name changes and empty-toolgroup → None behavior.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: migrating from deprecated vector_dbs API to the new vector_stores API, which is the core objective of the PR.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8fb26da and ad70fef.

📒 Files selected for processing (4)

src/app/endpoints/query.py (2 hunks)
src/app/endpoints/streaming_query.py (1 hunks)
tests/unit/app/endpoints/test_query.py (16 hunks)
tests/unit/app/endpoints/test_streaming_query.py (12 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/unit/app/endpoints/test_streaming_query.py

🧰 Additional context used

🧬 Code graph analysis (2)

src/app/endpoints/streaming_query.py (1)

src/app/endpoints/query.py (1)

get_rag_toolgroups (848-875)

tests/unit/app/endpoints/test_query.py (1)

src/app/endpoints/query.py (1)

get_rag_toolgroups (848-875)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: build-pr
GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
GitHub Check: e2e_tests (azure)
GitHub Check: e2e_tests (ci)

🔇 Additional comments (5)

src/app/endpoints/query.py (2)

750-752: LGTM! Vector stores API integration is consistent.

The migration from vector_dbs to vector_stores is correctly implemented, matching the pattern in streaming_query.py:

Uses the new .data attribute from the list response

Accesses .id attribute on each vector store

Passes vector_store_ids to get_rag_toolgroups

848-875: Function signature and implementation updated correctly.

The function has been properly migrated:

Parameter renamed from vector_db_ids to vector_store_ids

Docstrings updated to reflect "vector store" terminology

Args dictionary key changed to "vector_store_ids" (line 869)

Return logic unchanged and correct

The past review comment regarding test assertion compatibility was addressed in commit 7fd8a8f.

tests/unit/app/endpoints/test_query.py (2)

535-539: LGTM! Test mocks consistently updated for vector_stores API.

All test mocks have been properly updated to reflect the new API structure:

Changed from mock_client.vector_dbs.list() to mock_client.vector_stores.list()

Response wrapped in a mock object with .data attribute

Vector store objects use .id attribute (e.g., "VectorDB-1")

The mock structure correctly simulates the llama-stack-client API response format.

Also applies to: 575-579, 616-620, 665-667, 723-725, 784-786, 847-849, 904-906, 962-964, 1146-1148, 1229-1231, 1290-1292, 1403-1407, 1672-1676, 1729-1733

1442-1453: LGTM! Test validates the updated function signature and return value.

The test for get_rag_toolgroups correctly:

Uses vector_store_ids as the parameter name (lines 1444, 1448)

Verifies the function returns None for empty lists

Checks the returned toolgroup args contain "vector_store_ids" key (line 1453)

This confirms the function signature migration is complete and correct.

src/app/endpoints/streaming_query.py (1)

1078-1086: All API assumptions verified and correct.

The code properly uses:

.data attribute on the client.vector_stores.list() response (confirmed across 50+ test mocks)

.id attribute on individual vector store objects (confirmed in all test mocks)

"vector_store_ids" as the builtin RAG tool argument key (confirmed in function definition at src/app/endpoints/query.py:848-869 and test assertions)

The API structure matches the llama-stack-client expectations, and the implementation is consistent throughout the codebase.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52c0f6d and a5f1f16.

📒 Files selected for processing (2)

src/app/endpoints/query.py (2 hunks)
src/app/endpoints/streaming_query.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/app/endpoints/streaming_query.py

🧰 Additional context used

📓 Path-based instructions (5)

src/**/*.py