LCORE-471: Remove data collector service #377
Conversation
Walkthrough

This PR removes the Data Collector feature across code, configuration, and docs; simplifies startup to always run the web service; updates configuration schemas; and expands OpenAPI with a new conversations endpoint and database configuration (sqlite/postgres). Numerous tests and documentation sections related to the Data Collector are deleted.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant CLI as CLI
    participant App as lightspeed_stack.py
    participant Web as Uvicorn/Web Service
    CLI->>App: main(args)
    App->>App: parse args (no --data-collector)
    alt dump configuration
        App-->>CLI: write config and exit
    else start service
        App->>Web: start_uvicorn()
        Web-->>CLI: service running
    end
```

```mermaid
sequenceDiagram
    participant Client as Client
    participant API as Web Service
    participant DB as Database (sqlite/postgres)
    Client->>API: GET /v1/conversations
    API->>DB: Query user conversations
    DB-->>API: Conversations list
    API-->>Client: 200 ConversationsListResponse
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
Actionable comments posted: 4
🔭 Outside diff range comments (7)
docs/testing.md (1)

68-71: Align integration test coverage threshold in docs with Makefile

The docs in `docs/testing.md` (lines 68–71) state that the integration-test coverage threshold is 60%, but the Makefile actually sets `--cov-fail-under=10` for `test-integration`. Update one or the other to keep them in sync.

• File: docs/testing.md (around line 69)
• Makefile (line 21) sets the integration threshold to 10%

Proposed doc fix:

```diff
-Currently code coverage threshold for integration tests is set to 60%. This value is specified directly in Makefile, because the coverage threshold is different from threshold required for unit tests.
+Currently code coverage threshold for integration tests is set to 10%. This value is specified directly in Makefile, because the coverage threshold is different from threshold required for unit tests.
```

.github/workflows/e2e_tests.yaml (1)
85-93: lightspeed-stack.yaml keys inconsistent with repository/docs (`feedback_disabled` vs `feedback_enabled`)

Other configs/docs use `feedback_enabled`/`transcripts_enabled`. Here the workflow writes `feedback_disabled`/`transcripts_disabled`, which may not be recognized by the config model.

Apply this diff to align with the rest of the repo:

```diff
   user_data_collection:
-    feedback_disabled: false
-    feedback_storage: "/tmp/data/feedback"
-    transcripts_disabled: false
-    transcripts_storage: "/tmp/data/transcripts"
+    feedback_enabled: true
+    feedback_storage: "/tmp/data/feedback"
+    transcripts_enabled: true
+    transcripts_storage: "/tmp/data/transcripts"
   authentication:
     module: "noop"
```

src/lightspeed_stack.py (2)
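A lightweight way to catch this kind of key drift, independent of the config model, is to diff a section's keys against the recognized set. The key names below are taken from this review's diff and are an assumption for illustration, not the service's actual schema:

```python
# Recognized keys for the user_data_collection section, per the diff above.
# This set is an assumption for illustration, not the service's real schema.
KNOWN_KEYS = {
    "feedback_enabled",
    "feedback_storage",
    "transcripts_enabled",
    "transcripts_storage",
}


def unknown_keys(section: dict) -> set:
    """Return keys the config model would not recognize (and may silently ignore)."""
    return set(section) - KNOWN_KEYS
```

With this, `unknown_keys({"feedback_disabled": False})` flags the stale key instead of letting it pass silently.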
61-64: Avoid logging full configuration objects (risk of leaking secrets like API keys)

`logger.info("Configuration: %s", configuration.configuration)` likely prints sensitive fields (e.g., `llama_stack.api_key`). Reduce detail or redact.

Apply a minimal hardening:

```diff
-    logger.info("Configuration: %s", configuration.configuration)
-    logger.info(
-        "Llama stack configuration: %s", configuration.llama_stack_configuration
-    )
+    # Avoid dumping full configuration at INFO level to prevent secret leakage
+    logger.debug("Configuration loaded.")
+    logger.debug("Llama stack configuration loaded.")
```

If you have a redaction-safe serializer, prefer logging that instead.
66-68: Wrap client initialization with error handling

Network or config errors will currently bubble up unhandled. Fail fast with a clear message and a non-zero exit.

```diff
-    asyncio.run(
-        AsyncLlamaStackClientHolder().load(configuration.configuration.llama_stack)
-    )
+    try:
+        asyncio.run(
+            AsyncLlamaStackClientHolder().load(configuration.configuration.llama_stack)
+        )
+    except Exception as exc:
+        logger.error("Failed to initialize Llama Stack client: %s", exc, exc_info=True)
+        raise SystemExit(1)
```

src/models/config.py (1)
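The fail-fast pattern suggested there can be sketched generically; `load_client` below is a hypothetical stand-in for the real async loader, not the project's API:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def load_client(url: str) -> str:
    # Hypothetical stand-in for AsyncLlamaStackClientHolder().load(...)
    if not url.startswith("http"):
        raise ValueError(f"invalid Llama Stack URL: {url!r}")
    return "client"


def init_or_exit(url: str) -> str:
    """Fail fast: log a clear message and exit non-zero instead of a raw traceback."""
    try:
        return asyncio.run(load_client(url))
    except Exception as exc:
        logger.error("Failed to initialize Llama Stack client: %s", exc)
        raise SystemExit(1) from exc
```

The `raise SystemExit(1)` gives operators a clean exit code to check for in service managers, rather than a stack trace from deep inside the client library.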
241-246: Avoid assert for runtime validation in jwk_configuration property

Using assert may raise AssertionError and get optimized away. Raise a ValueError to keep error semantics consistent.

```diff
     def jwk_configuration(self) -> JwkConfiguration:
         """Return JWK configuration if the module is JWK token."""
         if self.module != constants.AUTH_MOD_JWK_TOKEN:
             raise ValueError(
                 "JWK configuration is only available for JWK token authentication module"
             )
-        assert self.jwk_config is not None, "JWK configuration should not be None"
-        return self.jwk_config
+        if self.jwk_config is None:
+            raise ValueError("JWK configuration must be provided for JWK token module")
+        return self.jwk_config
```

docs/config.puml (1)
12-22: Add DatabaseConfiguration and DB classes to diagram (docs out of sync with code and OpenAPI)

Configuration now includes a database field with SQLite/PostgreSQL options. The diagram should reflect:

- Class DatabaseConfiguration with sqlite/postgres.
- Classes SQLiteDatabaseConfiguration and PostgreSQLDatabaseConfiguration.
- Composition relationships.

Proposed additions:

```diff
 class "Configuration" as src.models.config.Configuration {
   authentication : Optional[AuthenticationConfiguration]
   customization : Optional[Customization]
+  database
   inference
   llama_stack
   mcp_servers : list[ModelContextProtocolServer]
   name : str
   service
   user_data_collection
   dump(filename: str) -> None
 }
@@
+class "DatabaseConfiguration" as src.models.config.DatabaseConfiguration {
+  sqlite : Optional[SQLiteDatabaseConfiguration]
+  postgres : Optional[PostgreSQLDatabaseConfiguration]
+  check_database_configuration() -> Self
+  db_type
+  config
+}
+class "SQLiteDatabaseConfiguration" as src.models.config.SQLiteDatabaseConfiguration {
+  db_path : str
+}
+class "PostgreSQLDatabaseConfiguration" as src.models.config.PostgreSQLDatabaseConfiguration {
+  host : str
+  port : int
+  db : str
+  user : str
+  password : str
+  namespace : Optional[str]
+  ssl_mode : str
+  gss_encmode : str
+  ca_cert_path : Optional[FilePath]
+  check_postgres_configuration() -> Self
+}
@@
+src.models.config.DatabaseConfiguration --* src.models.config.Configuration : database
+src.models.config.SQLiteDatabaseConfiguration --* src.models.config.DatabaseConfiguration : sqlite
+src.models.config.PostgreSQLDatabaseConfiguration --* src.models.config.DatabaseConfiguration : postgres
```

Also applies to: 71-77, 79-85
docs/openapi.json (1)

41-55: Invalid OpenAPI response format – extraneous `name`/`version` fields

The spec validator fails because only `description`, `headers`, `content`, and `links` are allowed at the response object level. Move your sample fields under the media type's `example` (or `examples`) property.

Please update docs/openapi.json at the following locations:

- Lines 41–55 (`/v1/info` GET response)
- Lines 67–99
- Lines 124–137
- Lines 235–275
- Lines 388–405
- Lines 446–463
- Lines 518–521
- Lines 555–577
- Lines 599–609
- Lines 665–674

Apply a patch similar to:

```diff
 "responses": {
   "200": {
     "description": "Successful Response",
     "content": {
-      "application/json": {
-        "schema": { "$ref": "#/components/schemas/InfoResponse" },
-        "name": "Service name",
-        "version": "Service version"
-      }
+      "application/json": {
+        "schema": { "$ref": "#/components/schemas/InfoResponse" },
+        "example": {
+          "name": "Service name",
+          "version": "Service version"
+        }
+      }
     }
   }
 }
```

After making these changes, rerun your OpenAPI validation to confirm there are no more unevaluated-properties errors.
🧹 Nitpick comments (11)
README.md (1)
45-45: Add a short migration note for Data Collector

Consider adding a “Where did Data Collector go?” note linking to the new repo and minimal migration steps. This will help users who have old configs/scripts.
.github/workflows/e2e_tests.yaml (1)
4-4: Security: pull_request_target runs with secrets

This workflow uses pull_request_target and provides OPENAI_API_KEY as a job-level secret. Even with persist-credentials off, secrets are accessible to steps that operate on PR-provided code/content (e.g., docker compose). Consider moving to pull_request with a hardened approach (or workflow_run from a trusted branch) and using a scoped token or ephemeral secret provisioning.
docs/getting_started.md (1)
267-270: Auth block addition LGTM; add note on precedence with service.auth_enabled

Good addition. Please add a one-liner clarifying how `service.auth_enabled` interacts with the new top-level `authentication` (which one takes precedence), to avoid confusion.

lightspeed-stack.yaml (1)
23-26: Top-level authentication block added; clarify/deprecate overlapping auth flags

This is good. Given `service.auth_enabled` still exists, please document or enforce precedence (e.g., deprecate `service.auth_enabled` in favor of `authentication.module`) to avoid dual-source-of-truth confusion.

docs/deployment_guide.md (3)
572-575: Auth config snippet updated — add brief usage note

Consider adding one sentence under this snippet explaining what `"noop"` does and how to switch providers.
885-887: Auth config snippet updated — keep examples consistent

Looks good. Repeat a short note about auth provider selection here for consistency.
1652-1654: Auth block addition LGTM (container flow)

Consistent with other examples. A single centralized “Authentication” subsection referenced by these snippets could reduce duplication.
src/lightspeed_stack.py (1)
56-59: Wire the --verbose flag to logging level

Verbose currently doesn’t affect the log level. Set the root logger and handler levels to DEBUG when requested.

```diff
     parser = create_argument_parser()
     args = parser.parse_args()
+    if args.verbose:
+        logging.getLogger().setLevel(logging.DEBUG)
+        for h in logging.getLogger().handlers:
+            h.setLevel(logging.DEBUG)
+
     configuration.load_configuration(args.config_file)
```

src/models/config.py (3)
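In isolation, that wiring amounts to a small helper like the one below (a sketch; the function name is hypothetical, and the behavior mirrors the diff above):

```python
import logging


def apply_verbosity(verbose: bool) -> None:
    """Raise the root logger and all of its handlers to DEBUG when verbose is set."""
    level = logging.DEBUG if verbose else logging.INFO
    root = logging.getLogger()
    root.setLevel(level)
    for handler in root.handlers:
        # Handlers filter independently of the logger, so both must be lowered
        handler.setLevel(level)
```

Setting only the logger (or only the handlers) is a common trap: a handler left at INFO silently drops the DEBUG records the logger now emits.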
49-55: Tighten PostgreSQL port validation messages

The logic is correct, but the messages can be clearer and aligned with the checks.

```diff
-        if self.port <= 0:
-            raise ValueError("Port value should not be negative")
-        if self.port > 65535:
-            raise ValueError("Port value should be less than 65536")
+        if self.port <= 0:
+            raise ValueError("Port must be greater than 0")
+        if self.port > 65535:
+            raise ValueError("Port must be <= 65535")
```
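The check itself, extracted as a standalone helper for clarity (valid TCP ports are 1–65535; the function name is illustrative):

```python
def validate_port(port: int) -> int:
    """Validate a TCP port number, returning it unchanged if acceptable."""
    if port <= 0:
        raise ValueError("Port must be greater than 0")
    if port > 65535:
        raise ValueError("Port must be <= 65535")
    return port
```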
73-75: Make default SQLite path portable across platforms

Hardcoding /tmp may not be portable (e.g., Windows) and could be restricted in some environments. Use tempfile.gettempdir().

```diff
+import tempfile
@@
-        sqlite_file_name = "/tmp/lightspeed-stack.db"
-        self.sqlite = SQLiteDatabaseConfiguration(db_path=sqlite_file_name)
+        tmp_dir = Path(tempfile.gettempdir())
+        sqlite_file_name = tmp_dir / "lightspeed-stack.db"
+        self.sqlite = SQLiteDatabaseConfiguration(db_path=str(sqlite_file_name))
```
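The portable default resolves like this (the filename is taken from the diff; the temp directory varies by platform — `/tmp` on most Unixes, the user temp folder on Windows):

```python
import tempfile
from pathlib import Path


def default_sqlite_path() -> str:
    """Platform-appropriate replacement for a hardcoded /tmp path."""
    return str(Path(tempfile.gettempdir()) / "lightspeed-stack.db")
```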
152-160: Nit: unify “Llama” casing in error messages

Maintain consistent casing (“Llama”) across messages.

```diff
-                "LLama stack URL is not specified and library client mode is not specified"
+                "Llama stack URL is not specified and library client mode is not specified"
@@
-                "LLama stack URL is not specified and library client mode is not enabled"
+                "Llama stack URL is not specified and library client mode is not enabled"
@@
-                "LLama stack library client mode is enabled but a configuration file path is not specified"  # noqa: E501
+                "Llama stack library client mode is enabled but a configuration file path is not specified"  # noqa: E501
```

Also applies to: 166-168
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- `.github/workflows/e2e_tests.yaml` (1 hunks)
- `Makefile` (1 hunks)
- `README.md` (1 hunks)
- `docs/config.puml` (2 hunks)
- `docs/deployment_guide.md` (4 hunks)
- `docs/getting_started.md` (1 hunks)
- `docs/openapi.json` (13 hunks)
- `docs/openapi.md` (0 hunks)
- `docs/output.md` (0 hunks)
- `docs/testing.md` (1 hunks)
- `lightspeed-stack.yaml` (1 hunks)
- `src/constants.py` (0 hunks)
- `src/lightspeed_stack.py` (1 hunks)
- `src/models/config.py` (1 hunks)
- `src/runners/data_collector.py` (0 hunks)
- `src/services/data_collector.py` (0 hunks)
- `tests/unit/models/test_config.py` (0 hunks)
- `tests/unit/runners/test_data_collector_runner.py` (0 hunks)
- `tests/unit/services/test_data_collector.py` (0 hunks)
💤 Files with no reviewable changes (8)
- src/runners/data_collector.py
- tests/unit/runners/test_data_collector_runner.py
- src/constants.py
- docs/openapi.md
- src/services/data_collector.py
- docs/output.md
- tests/unit/models/test_config.py
- tests/unit/services/test_data_collector.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (8)
docs/testing.md (1)
108-110: Unit test tree update for runners looks correct

Removal of data collector runner tests is reflected here. Matches the PR objective.
Makefile (1)
11-11: Removed run-data-collector target

Target removal aligns with the feature deprecation. README Make targets are consistent.
docs/deployment_guide.md (1)
1497-1499: Auth block addition LGTM

Matches the new configuration surface.
src/lightspeed_stack.py (1)
50-50: All data-collector references successfully removed

Verified across the repository with:

```
rg -n --hidden --glob '!*venv*' 'data_collector|start_data_collector|--data-collector'
```

(no matches found) No further action required.
src/models/config.py (2)
42-44: PostgreSQL default constants verified
The constants `POSTGRES_DEFAULT_SSL_MODE` (line 53) and `POSTGRES_DEFAULT_GSS_ENCMODE` (line 55) are present in `src/constants.py`. No further action required.
1-305: No Data Collector references detected
A repository-wide search for common Data Collector identifiers (`data_collector`, `DataCollector`, `ingress_server_url`, `ingress_server_auth_token`, `ingress_content_service_name`, `collection_interval`, `cleanup_after_send`, `connection_timeout`, `DATA_COLLECTOR`) returned zero matches. All references have been removed.

docs/config.puml (1)
29-34: LGTM: InferenceConfiguration method surfaced

The addition of check_default_model_and_provider in the diagram matches the model. No issues.
docs/openapi.json (1)
825-832: LGTM: Configuration now includes database with default SQLite

This aligns with code defaults and the DatabaseConfiguration schema.
| "/v1/conversations": { | ||
| "get": { | ||
| "tags": [ | ||
| "conversations" | ||
| ], | ||
| "summary": "Get Conversations List Endpoint Handler", | ||
| "description": "Handle request to retrieve all conversations for the authenticated user.", | ||
| "operationId": "get_conversations_list_endpoint_handler_v1_conversations_get", | ||
| "responses": { | ||
| "200": { | ||
| "description": "Successful Response", | ||
| "content": { | ||
| "application/json": { | ||
| "schema": { | ||
| "$ref": "#/components/schemas/ConversationsListResponse" | ||
| } | ||
| } | ||
| }, | ||
| "conversations": [ | ||
| { | ||
| "conversation_id": "123e4567-e89b-12d3-a456-426614174000", | ||
| "created_at": "2024-01-01T00:00:00Z", | ||
| "last_message_at": "2024-01-01T00:05:00Z", | ||
| "last_used_model": "gemini/gemini-1.5-flash", | ||
| "last_used_provider": "gemini", | ||
| "message_count": 5 | ||
| }, | ||
| { | ||
| "conversation_id": "456e7890-e12b-34d5-a678-901234567890", | ||
| "created_at": "2024-01-01T01:00:00Z", | ||
| "last_message_at": "2024-01-01T01:02:00Z", | ||
| "last_used_model": "gemini/gemini-2.0-flash", | ||
| "last_used_provider": "gemini", | ||
| "message_count": 2 | ||
| } | ||
| ] | ||
| }, | ||
| "503": { | ||
| "description": "Service Unavailable", | ||
| "detail": { | ||
| "response": "Unable to connect to Llama Stack", | ||
| "cause": "Connection error." | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }, |
🛠️ Refactor suggestion
Add 400/403 responses to /v1/conversations (endpoint states “authenticated user”)
The description says it returns conversations for the authenticated user, but only 200/503 are documented. Include 400 (missing/invalid credentials) and 403 (unauthorized) like other endpoints.
"/v1/conversations": {
"get": {
"tags": ["conversations"],
"summary": "Get Conversations List Endpoint Handler",
"description": "Handle request to retrieve all conversations for the authenticated user.",
"operationId": "get_conversations_list_endpoint_handler_v1_conversations_get",
"responses": {
"200": { ... },
+ "400": {
+ "description": "Missing or invalid credentials provided by client",
+ "content": {
+ "application/json": {
+ "schema": { "$ref": "#/components/schemas/UnauthorizedResponse" }
+ }
+ }
+ },
+ "403": {
+ "description": "User is not authorized",
+ "content": {
+ "application/json": {
+ "schema": { "$ref": "#/components/schemas/ForbiddenResponse" }
+ }
+ }
+ },
"503": { ... }
}
}
🤖 Prompt for AI Agents
In docs/openapi.json around lines 370 to 416, the /v1/conversations GET
currently documents only 200 and 503 responses but the endpoint is for an
authenticated user; add standard 400 and 403 responses consistent with other
endpoints by adding a 400 response for missing/invalid credentials (brief
description and reference to the existing error schema or inline error object)
and a 403 response for unauthorized access (description and same error schema
reference), ensuring the response objects match the OpenAPI structure used
elsewhere (content/application/json with the proper error schema $ref or
fields).
| "ConversationDetails": { | ||
| "properties": { | ||
| "conversation_id": { | ||
| "type": "string", | ||
| "title": "Conversation Id" | ||
| }, | ||
| "created_at": { | ||
| "anyOf": [ | ||
| { | ||
| "type": "string" | ||
| }, | ||
| { | ||
| "type": "null" | ||
| } | ||
| ], | ||
| "title": "Created At" | ||
| }, | ||
| "last_message_at": { | ||
| "anyOf": [ | ||
| { | ||
| "type": "string" | ||
| }, | ||
| { | ||
| "type": "null" | ||
| } | ||
| ], | ||
| "title": "Last Message At" | ||
| }, | ||
| "message_count": { | ||
| "anyOf": [ | ||
| { | ||
| "type": "integer" | ||
| }, | ||
| { | ||
| "type": "null" | ||
| } | ||
| ], | ||
| "title": "Message Count" | ||
| }, | ||
| "last_used_model": { | ||
| "anyOf": [ | ||
| { | ||
| "type": "string" | ||
| }, | ||
| { | ||
| "type": "null" | ||
| } | ||
| ], | ||
| "title": "Last Used Model" | ||
| }, | ||
| "last_used_provider": { | ||
| "anyOf": [ | ||
| { | ||
| "type": "string" | ||
| }, | ||
| { | ||
| "type": "null" | ||
| } | ||
| ], | ||
| "title": "Last Used Provider" | ||
| } | ||
| }, | ||
| "type": "object", | ||
| "required": [ | ||
| "conversation_id" | ||
| ], | ||
| "title": "ConversationDetails", | ||
| "description": "Model representing the details of a user conversation.\n\nAttributes:\n conversation_id: The conversation ID (UUID).\n created_at: When the conversation was created.\n last_message_at: When the last message was sent.\n message_count: Number of user messages in the conversation.\n model: The model used for the conversation.\n\nExample:\n ```python\n conversation = ConversationSummary(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\"\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n model=\"gemini/gemini-2.0-flash\"\n )\n ```" | ||
| }, |
Fix ConversationDetails description: use last_used_model/last_used_provider; update examples
The schema fields are last_used_model and last_used_provider, but the description refers to model and examples omit last_used_provider.
"ConversationDetails": {
"properties": { ... },
"type": "object",
"required": ["conversation_id"],
"title": "ConversationDetails",
- "description": "Model representing the details of a user conversation.\n\nAttributes:\n conversation_id: The conversation ID (UUID).\n created_at: When the conversation was created.\n last_message_at: When the last message was sent.\n message_count: Number of user messages in the conversation.\n model: The model used for the conversation.\n\nExample:\n ```python\n conversation = ConversationSummary(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\"\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n model=\"gemini/gemini-2.0-flash\"\n )\n ```"
+ "description": "Model representing the details of a user conversation.\n\nAttributes:\n conversation_id: The conversation ID (UUID).\n created_at: When the conversation was created.\n last_message_at: When the last message was sent.\n message_count: Number of user messages in the conversation.\n last_used_model: The last model used in the conversation.\n last_used_provider: The provider of the last model.\n\nExample:\n ```python\n conversation = ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\",\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\"\n )\n ```"
},🤖 Prompt for AI Agents
In docs/openapi.json around lines 911 to 979, the ConversationDetails
description and example refer to "model" and omit "last_used_provider" while the
schema uses last_used_model and last_used_provider; update the description to
mention last_used_model and last_used_provider and change the example to
construct a ConversationDetails object (not ConversationSummary) including both
last_used_model and last_used_provider with appropriate sample values, and
ensure the attribute names in the docstring match the schema exactly.
| "ConversationsListResponse": { | ||
| "properties": { | ||
| "conversations": { | ||
| "items": { | ||
| "$ref": "#/components/schemas/ConversationDetails" | ||
| }, | ||
| "type": "array", | ||
| "title": "Conversations" | ||
| } | ||
| }, | ||
| "type": "object", | ||
| "required": [ | ||
| "conversations" | ||
| ], | ||
| "title": "ConversationsListResponse", | ||
| "description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation details associated with the user.\n\nExample:\n ```python\n conversations_list = ConversationsListResponse(\n conversations=[\n ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\",\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n model=\"gemini/gemini-2.0-flash\"\n ),\n ConversationDetails(\n conversation_id=\"456e7890-e12b-34d5-a678-901234567890\"\n created_at=\"2024-01-01T01:00:00Z\",\n message_count=2,\n model=\"gemini/gemini-2.5-flash\"\n )\n ]\n )\n ```", | ||
| "examples": [ | ||
| { | ||
| "conversations": [ | ||
| { | ||
| "conversation_id": "123e4567-e89b-12d3-a456-426614174000", | ||
| "created_at": "2024-01-01T00:00:00Z", | ||
| "last_message_at": "2024-01-01T00:05:00Z", | ||
| "message_count": 5, | ||
| "model": "gemini/gemini-2.0-flash" | ||
| }, | ||
| { | ||
| "conversation_id": "456e7890-e12b-34d5-a678-901234567890", | ||
| "created_at": "2024-01-01T01:00:00Z", | ||
| "message_count": 2, | ||
| "model": "gemini/gemini-2.5-flash" | ||
| } | ||
| ] | ||
| } | ||
| ] | ||
| }, |
Fix ConversationsListResponse examples: replace model with last_used_model and include last_used_provider
Examples use a non-existent model field.
"ConversationsListResponse": {
"properties": { ... },
"type": "object",
"required": ["conversations"],
"title": "ConversationsListResponse",
- "description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation details associated with the user.\n\nExample:\n ```python\n conversations_list = ConversationsListResponse(\n conversations=[\n ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\",\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n model=\"gemini/gemini-2.0-flash\"\n ),\n ConversationDetails(\n conversation_id=\"456e7890-e12b-34d5-a678-901234567890\"\n created_at=\"2024-01-01T01:00:00Z\",\n message_count=2,\n model=\"gemini/gemini-2.5-flash\"\n )\n ]\n )\n ```",
+ "description": "Model representing a response for listing conversations of a user.\n\nAttributes:\n conversations: List of conversation details associated with the user.\n\nExample:\n ```python\n conversations_list = ConversationsListResponse(\n conversations=[\n ConversationDetails(\n conversation_id=\"123e4567-e89b-12d3-a456-426614174000\",\n created_at=\"2024-01-01T00:00:00Z\",\n last_message_at=\"2024-01-01T00:05:00Z\",\n message_count=5,\n last_used_model=\"gemini/gemini-2.0-flash\",\n last_used_provider=\"gemini\"\n ),\n ConversationDetails(\n conversation_id=\"456e7890-e12b-34d5-a678-901234567890\",\n created_at=\"2024-01-01T01:00:00Z\",\n message_count=2,\n last_used_model=\"gemini/gemini-2.5-flash\",\n last_used_provider=\"gemini\"\n )\n ]\n )\n ```",
"examples": [
{
"conversations": [
{
"conversation_id": "123e4567-e89b-12d3-a456-426614174000",
"created_at": "2024-01-01T00:00:00Z",
"last_message_at": "2024-01-01T00:05:00Z",
- "message_count": 5,
- "model": "gemini/gemini-2.0-flash"
+ "message_count": 5,
+ "last_used_model": "gemini/gemini-2.0-flash",
+ "last_used_provider": "gemini"
},
{
"conversation_id": "456e7890-e12b-34d5-a678-901234567890",
"created_at": "2024-01-01T01:00:00Z",
- "message_count": 2,
- "model": "gemini/gemini-2.5-flash"
+ "message_count": 2,
+ "last_used_model": "gemini/gemini-2.5-flash",
+ "last_used_provider": "gemini"
}
]
}
]
}🤖 Prompt for AI Agents
In docs/openapi.json around lines 1024 to 1059, the ConversationsListResponse
examples and example Python snippet reference a non-existent "model" field;
update both the description/example block and the JSON examples to replace
"model" with "last_used_model" and add "last_used_provider" (e.g.,
last_used_model: "gemini/gemini-2.0-flash", last_used_provider: "gemini") for
each ConversationDetails entry so the examples match the actual schema.
```diff
 from typing import Optional

-from pydantic import BaseModel, model_validator, FilePath, AnyHttpUrl, PositiveInt
+from pydantic import BaseModel, model_validator, FilePath, AnyHttpUrl
```
🛠️ Refactor suggestion
Use Field(default_factory=...) to avoid shared default instances and mutable defaults
Several fields instantiate BaseModel instances or lists as defaults. In pydantic, prefer Field(default_factory=...) to prevent shared state across instances.
Apply:
```diff
-from pydantic import BaseModel, model_validator, FilePath, AnyHttpUrl
+from pydantic import BaseModel, model_validator, FilePath, AnyHttpUrl, Field
@@
-    tls_config: TLSConfiguration = TLSConfiguration()
+    tls_config: TLSConfiguration = Field(default_factory=TLSConfiguration)
@@
 class JwkConfiguration(BaseModel):
@@
-    jwt_configuration: JwtConfiguration = JwtConfiguration()
+    jwt_configuration: JwtConfiguration = Field(default_factory=JwtConfiguration)
@@
 class Configuration(BaseModel):
@@
-    database: DatabaseConfiguration = DatabaseConfiguration()
-    mcp_servers: list[ModelContextProtocolServer] = []
-    authentication: Optional[AuthenticationConfiguration] = (
-        AuthenticationConfiguration()
-    )
+    database: DatabaseConfiguration = Field(default_factory=DatabaseConfiguration)
+    mcp_servers: list[ModelContextProtocolServer] = Field(default_factory=list)
+    authentication: Optional[AuthenticationConfiguration] = Field(
+        default_factory=AuthenticationConfiguration
+    )
     customization: Optional[Customization] = None
-    inference: InferenceConfiguration = InferenceConfiguration()
+    inference: InferenceConfiguration = Field(default_factory=InferenceConfiguration)
```

Also applies to: 108-109, 199-208, 289-300
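The hazard motivating `default_factory` is easiest to see with the standard library's `dataclasses`, which uses the same pattern (and outright rejects a bare mutable default like `messages: list = []`):

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # default_factory builds a fresh list per instance instead of sharing one;
    # `messages: list = []` would raise ValueError at class-definition time.
    messages: list = field(default_factory=list)


a = Conversation()
b = Conversation()
a.messages.append("hi")
# b.messages stays empty: the two instances do not share a list
```

Pydantic's `Field(default_factory=...)` serves the same role, and it also defers construction of the default, which matters when the default model runs validators of its own.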
🤖 Prompt for AI Agents
In src/models/config.py around lines 6 (and also apply same change at 108-109,
199-208, 289-300), several Pydantic model fields are using mutable default
values (BaseModel instances or lists) directly; replace those direct defaults
with Field(default_factory=...) so each model instance gets its own fresh
object; update the import to include Field if missing and change each offending
field declaration to use Field(default_factory=lambda: <default>) or
Field(default_factory=SomeModel) as appropriate.
There is a bigger change in
f961d3f to a9c1097
tisnik left a comment:
LGTM
Description
Complete removal of the data collector service and all its references from the codebase. The data collector functionality has been separated into its own service/repository.
Type of change
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Chores
Documentation
Tests