Read only uses #9237

alphaprinz · 2025-10-07T16:16:42Z

Describe the Problem

Use new 'read only' pool (connected to cnpg replicated read-only cluster) in order to alleviate load from main, writable pg cluster.

Explain the Changes

Move some background queries to read-only pool:
-db cleancer
-object reclaimer's first find
-scrubber's first find

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

Doc added/updated
Tests added

Summary by CodeRabbit

New Features
- Introduced optional pool preference parameter for SQL query execution, allowing fine-grained control over resource routing.
Improvements
- Read operations optimized with automatic routing for improved performance and resource utilization.

coderabbitai · 2025-10-07T16:16:52Z

Walkthrough

This PR implements per-query read-only pool selection for database operations. The Postgres client is updated to support a preferred_pool parameter that routes read operations to dedicated read replicas. MDStore and other services are modified to specify read-only pool routing for read queries. TypeScript types and debug logging are also updated accordingly.

Changes

Cohort / File(s)	Summary
SDK TypeScript Declarations `src/sdk/nb.d.ts`	Added `preferred_pool?: string` to the `executeSQL` options parameter in `DBCollection` interface to support per-query pool selection.
Postgres Client Implementation `src/util/postgres_client.js`	Implemented pool selection routing via `options.preferred_pool` in `executeSQL`, `find`, and `findOne` methods. Added fallback logic in `get_pool` to handle pool key mismatches and extended JSDoc documentation.
MDStore Read Paths `src/server/object_services/md_store.js`	Added `preferred_pool: 'read_only'` to multiple read-query operations including find, findOne, mapReduce, and SQL execute calls to route read operations to read-replica pools.
Logging Adjustments `src/server/bg_services/db_cleaner.js`	Reduced logging verbosity by replacing debug log level 2 (log2) with level 0 (log0) for three MDStore-related outputs.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Service/Query Caller
    participant Client as PostgresTable/MDStore
    participant PoolMgr as Pool Manager (get_pool)
    participant Pool as Readonly/Default Pool
    participant DB as Database

    Caller->>Client: executeSQL/find(query, params, {preferred_pool: 'read_only'})
    Client->>PoolMgr: get_pool('read_only')
    alt Pool exists
        PoolMgr-->>Client: return read_only pool
    else Fallback
        PoolMgr->>PoolMgr: fallback to default pool
        PoolMgr-->>Client: return default pool
    end
    Client->>Pool: execute query
    Pool->>DB: route to read replica
    DB-->>Pool: result
    Pool-->>Client: return result
    Client-->>Caller: return data

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–25 minutes

The changes involve new pool selection logic with fallback handling in the Postgres client, repetitive but consistent addition of preferred_pool across MDStore read operations, TypeScript interface updates, and minor logging adjustments. The homogeneous pattern in MDStore reduces cognitive load despite the breadth of files affected.

Possibly related PRs

performance - read only pool POC #9197 — Implements complementary read-only pool support and per-query pool selection infrastructure in the Postgres client and related services.

Suggested reviewers

dannyzaken

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The PR title "Read only uses" is directly related to the main changes in the changeset. The modifications across all files serve a single, cohesive purpose: enabling queries to route to a read-only pool by adding support for a `preferred_pool` parameter and then applying it to various read operations in the background services. The title, while terse, clearly conveys this primary objective and would help teammates quickly understand that the PR concerns read-only pool utilization. The changes are not vague or off-topic—they are specifically about leveraging a read-only pool to reduce load on the main database cluster.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Signed-off-by: Amit Prinz Setter <[email protected]>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/util/postgres_client.js (1)
632-642: Review the fallback behavior in get_pool.

The fallback logic on lines 636-637 silently falls back to the default pool when a requested pool key is invalid:
if (key && key !== this.pool_key) {
    return this.get_pool();
}
Concerns:

Silent fallback masks configuration issues: If code requests a non-existent pool (e.g., typo like 'read_onyl'), it silently uses the default pool instead of failing fast.

Misleading error message: After the fallback attempt, line 639 throws an error mentioning the requested key, but the fallback might have already succeeded with this.pool_key.

Recommendation:
Consider removing the fallback or adding a warning log when falling back, so configuration issues are visible:
 get_pool(key = this.pool_key) {
     const pool = this.client.get_pool(key);
     if (!pool) {
-        //if original get_pool was no for the default this.pool_key, try also this.pool_key
+        // If original get_pool was not for the default this.pool_key, try also this.pool_key
         if (key && key !== this.pool_key) {
+            dbg.warn(`Pool ${key} not found, falling back to default pool ${this.pool_key}`);
             return this.get_pool();
         }
         throw new Error(`The postgres clients pool ${key} disconnected`);
     }
     return pool;
 }
Alternatively, remove the fallback entirely to fail fast on invalid pool names.
src/server/object_services/md_store.js (1)

814-835: Route mapReduce operations to read-only pool for consistent replica distribution.

The list_objects and list_object_versions methods use mapReduce for queries with delimiters (lines 814-824 and 864-874), but these calls don't pass preferred_pool to route to the read-only pool. While the find() operations support routing via preferred_pool: 'read_only' (as seen elsewhere in md_store.js), the mapReduce calls should be consistent.

To fix:

Update mapReduceListObjects() and mapReduceAggregate() in postgres_client.js to extract options.preferred_pool and pass it to this.single_query() (currently they call this.single_query(mr_q) without the pool parameter)

Pass preferred_pool: 'read_only' in the options object when calling this._objects.mapReduce() at lines 814-824 and 864-874 in md_store.js

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2367740 and d699655.

📒 Files selected for processing (4)

src/sdk/nb.d.ts (1 hunks)
src/server/bg_services/db_cleaner.js (1 hunks)
src/server/object_services/md_store.js (5 hunks)
src/util/postgres_client.js (5 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-08-08T13:12:46.728Z

Learnt from: naveenpaul1
PR: noobaa/noobaa-core#9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.

Applied to files:

src/server/object_services/md_store.js

🧬 Code graph analysis (1)

src/server/bg_services/db_cleaner.js (1)

src/server/object_services/md_store.js (3)

dbg (14-14)

P (13-13)

config (28-28)

🔇 Additional comments (10)

src/server/object_services/md_store.js (6)

779-779: LGTM! Read-only pool routing for unreclaimed objects.

Correctly routes the find operation to the read-only pool, which aligns with the PR objective of offloading background read queries from the main cluster.

1127-1127: LGTM! Read-only pool routing for deleted objects query.

The SQL query correctly uses the read-only pool via the options parameter, consistent with the pattern established in this PR.

1603-1603: LGTM! Read-only pool routing for chunk iteration.

Appropriately routes the chunk iteration query to the read-only pool for the scrubber's background operations.

1778-1779: LGTM! Read-only pool routing for deleted chunks.

Correctly adds preferred_pool: 'read_only' to the find options for deleted chunks cleanup query.

1787-1788: LGTM! Read-only pool routing for chunk existence checks.

Both has_any_blocks_for_chunk and has_any_parts_for_chunk correctly route to the read-only pool. These existence checks are read-only operations suitable for read replicas.

Also applies to: 1796-1797

2039-2040: LGTM! Read-only pool routing for deleted blocks.

Correctly routes the deleted blocks query to the read-only pool for background cleanup operations.

src/util/postgres_client.js (3)

252-252: Helpful debug logging for pool routing.

The added debug log that includes pg_client.options?.host will be useful for verifying which pool (and host) is being used for each query, aiding in troubleshooting read-only pool routing.

726-733: LGTM! Clean implementation of preferred_pool in executeSQL.

The JSDoc and implementation correctly support routing queries to a preferred pool. The pattern of passing options.preferred_pool to get_pool() is clean and extensible.

937-937: LGTM! Consistent preferred_pool support in find and findOne.

Both methods correctly pass options.preferred_pool to get_pool(), enabling read-only pool routing for these common query methods. The implementation is consistent with the executeSQL pattern.

Also applies to: 954-954

src/sdk/nb.d.ts (1)

769-769: LGTM! TypeScript interface correctly updated.

The addition of preferred_pool?: string to the executeSQL options parameter aligns with the implementation in postgres_client.js and properly documents the new functionality for TypeScript consumers.

coderabbitai · 2025-10-17T22:53:17Z

src/server/bg_services/db_cleaner.js

    dbg.log0('DB_CLEANER: checking md-store for documents deleted before', new Date(last_date_to_remove));
    const objects_to_remove = await MDStore.instance().find_deleted_objects(last_date_to_remove, config.DB_CLEANER_DOCS_LIMIT);
-    dbg.log2('DB_CLEANER: list objects:', objects_to_remove);
+    dbg.log0('DB_CLEANER: list objects:', objects_to_remove);


⚠️ Potential issue | 🟡 Minor

Consider reverting to log2 or using log1 for these array logs.

Changing from log2 to log0 makes these logs print at the default verbosity level. Since objects_to_remove, blocks_to_remove, and filtered_chunks can contain up to config.DB_CLEANER_DOCS_LIMIT items (typically 1000), logging full arrays at log0 could be excessively verbose in production environments. Consider using log1 for informational output or keeping log2 for verbose debugging.

- dbg.log0('DB_CLEANER: list objects:', objects_to_remove); + dbg.log1('DB_CLEANER: list objects:', objects_to_remove);

Apply similar changes to lines 71 and 77.

Also applies to: 71-71, 77-77

🤖 Prompt for AI Agents

In src/server/bg_services/db_cleaner.js around lines 65, 71 and 77, currently dbg.log0 is used to print large arrays (objects_to_remove, blocks_to_remove, filtered_chunks) which is too verbose for default logging; change those dbg.log0 calls back to dbg.log2 (or dbg.log1 if you want informational but not very verbose output) so array dumps remain at a non-default verbosity level, preserving dbg.log0 for concise production logs; update the three lines accordingly and keep the log message text the same so only the log level is adjusted.

alphaprinz requested a review from dannyzaken October 7, 2025 16:16

pull-request-size bot added the size/M label Oct 7, 2025

alphaprinz added 3 commits October 13, 2025 20:50

use read only pool - db cleaner

8092131

Signed-off-by: Amit Prinz Setter <[email protected]>

use read only pool - object reclaimer first find

a4039d9

Signed-off-by: Amit Prinz Setter <[email protected]>

use read only pool - scrubber first find

d699655

Signed-off-by: Amit Prinz Setter <[email protected]>

alphaprinz force-pushed the read_only_uses branch from 0a008b6 to d699655 Compare October 14, 2025 03:50

alphaprinz marked this pull request as ready for review October 17, 2025 22:47

coderabbitai bot reviewed Oct 17, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Read only uses #9237

Read only uses #9237

alphaprinz commented Oct 7, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 7, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Oct 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Read only uses #9237

Are you sure you want to change the base?

Read only uses #9237

Conversation

alphaprinz commented Oct 7, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Describe the Problem

Explain the Changes

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 7, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Oct 17, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

alphaprinz commented Oct 7, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 7, 2025 •

edited

Loading