-
Notifications
You must be signed in to change notification settings - Fork 89
Read only uses #9237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Read only uses #9237
Conversation
WalkthroughThis PR implements per-query read-only pool selection for database operations. The Postgres client is updated to support a Changes
Sequence Diagram(s)sequenceDiagram
participant Caller as Service/Query Caller
participant Client as PostgresTable/MDStore
participant PoolMgr as Pool Manager (get_pool)
participant Pool as Readonly/Default Pool
participant DB as Database
Caller->>Client: executeSQL/find(query, params, {preferred_pool: 'read_only'})
Client->>PoolMgr: get_pool('read_only')
alt Pool exists
PoolMgr-->>Client: return read_only pool
else Fallback
PoolMgr->>PoolMgr: fallback to default pool
PoolMgr-->>Client: return default pool
end
Client->>Pool: execute query
Pool->>DB: route to read replica
DB-->>Pool: result
Pool-->>Client: return result
Client-->>Caller: return data
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–25 minutes The changes involve new pool selection logic with fallback handling in the Postgres client, repetitive but consistent addition of Possibly related PRs
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Signed-off-by: Amit Prinz Setter <[email protected]>
Signed-off-by: Amit Prinz Setter <[email protected]>
Signed-off-by: Amit Prinz Setter <[email protected]>
0a008b6
to
d699655
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/util/postgres_client.js (1)
632-642
: Review the fallback behavior in get_pool.The fallback logic on lines 636-637 silently falls back to the default pool when a requested pool key is invalid:
if (key && key !== this.pool_key) { return this.get_pool(); }Concerns:
- Silent fallback masks configuration issues: If code requests a non-existent pool (e.g., typo like
'read_onyl'
), it silently uses the default pool instead of failing fast.- Misleading error message: After the fallback attempt, line 639 throws an error mentioning the requested
key
, but the fallback might have already succeeded withthis.pool_key
.Recommendation:
Consider removing the fallback or adding a warning log when falling back, so configuration issues are visible:get_pool(key = this.pool_key) { const pool = this.client.get_pool(key); if (!pool) { - //if original get_pool was no for the default this.pool_key, try also this.pool_key + // If original get_pool was not for the default this.pool_key, try also this.pool_key if (key && key !== this.pool_key) { + dbg.warn(`Pool ${key} not found, falling back to default pool ${this.pool_key}`); return this.get_pool(); } throw new Error(`The postgres clients pool ${key} disconnected`); } return pool; }Alternatively, remove the fallback entirely to fail fast on invalid pool names.
src/server/object_services/md_store.js (1)
814-835
: Route mapReduce operations to read-only pool for consistent replica distribution.The
list_objects
andlist_object_versions
methods usemapReduce
for queries with delimiters (lines 814-824 and 864-874), but these calls don't passpreferred_pool
to route to the read-only pool. While thefind()
operations support routing viapreferred_pool: 'read_only'
(as seen elsewhere in md_store.js), the mapReduce calls should be consistent.To fix:
- Update
mapReduceListObjects()
andmapReduceAggregate()
in postgres_client.js to extractoptions.preferred_pool
and pass it tothis.single_query()
(currently they callthis.single_query(mr_q)
without the pool parameter)- Pass
preferred_pool: 'read_only'
in the options object when callingthis._objects.mapReduce()
at lines 814-824 and 864-874 in md_store.js
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/sdk/nb.d.ts
(1 hunks)src/server/bg_services/db_cleaner.js
(1 hunks)src/server/object_services/md_store.js
(5 hunks)src/util/postgres_client.js
(5 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-08T13:12:46.728Z
Learnt from: naveenpaul1
PR: noobaa/noobaa-core#9182
File: src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js:9-17
Timestamp: 2025-08-08T13:12:46.728Z
Learning: In upgrade script src/upgrade/upgrade_scripts/5.20.0/remove_mongo_pool.js for noobaa-core, rely on structural detection (e.g., pool.mongo_info, and resource_type === 'INTERNAL') with name-prefix fallback for removing legacy mongo/internal pools, instead of depending solely on config.INTERNAL_STORAGE_POOL_NAME or config.DEFAULT_POOL_NAME. Handle multi-system stores and remove all matching pools in one change.
Applied to files:
src/server/object_services/md_store.js
🧬 Code graph analysis (1)
src/server/bg_services/db_cleaner.js (1)
src/server/object_services/md_store.js (3)
dbg
(14-14)P
(13-13)config
(28-28)
🔇 Additional comments (10)
src/server/object_services/md_store.js (6)
779-779
: LGTM! Read-only pool routing for unreclaimed objects.Correctly routes the find operation to the read-only pool, which aligns with the PR objective of offloading background read queries from the main cluster.
1127-1127
: LGTM! Read-only pool routing for deleted objects query.The SQL query correctly uses the read-only pool via the options parameter, consistent with the pattern established in this PR.
1603-1603
: LGTM! Read-only pool routing for chunk iteration.Appropriately routes the chunk iteration query to the read-only pool for the scrubber's background operations.
1778-1779
: LGTM! Read-only pool routing for deleted chunks.Correctly adds
preferred_pool: 'read_only'
to the find options for deleted chunks cleanup query.
1787-1788
: LGTM! Read-only pool routing for chunk existence checks.Both
has_any_blocks_for_chunk
andhas_any_parts_for_chunk
correctly route to the read-only pool. These existence checks are read-only operations suitable for read replicas.Also applies to: 1796-1797
2039-2040
: LGTM! Read-only pool routing for deleted blocks.Correctly routes the deleted blocks query to the read-only pool for background cleanup operations.
src/util/postgres_client.js (3)
252-252
: Helpful debug logging for pool routing.The added debug log that includes
pg_client.options?.host
will be useful for verifying which pool (and host) is being used for each query, aiding in troubleshooting read-only pool routing.
726-733
: LGTM! Clean implementation of preferred_pool in executeSQL.The JSDoc and implementation correctly support routing queries to a preferred pool. The pattern of passing
options.preferred_pool
toget_pool()
is clean and extensible.
937-937
: LGTM! Consistent preferred_pool support in find and findOne.Both methods correctly pass
options.preferred_pool
toget_pool()
, enabling read-only pool routing for these common query methods. The implementation is consistent with the executeSQL pattern.Also applies to: 954-954
src/sdk/nb.d.ts (1)
769-769
: LGTM! TypeScript interface correctly updated.The addition of
preferred_pool?: string
to the executeSQL options parameter aligns with the implementation in postgres_client.js and properly documents the new functionality for TypeScript consumers.
dbg.log0('DB_CLEANER: checking md-store for documents deleted before', new Date(last_date_to_remove)); | ||
const objects_to_remove = await MDStore.instance().find_deleted_objects(last_date_to_remove, config.DB_CLEANER_DOCS_LIMIT); | ||
dbg.log2('DB_CLEANER: list objects:', objects_to_remove); | ||
dbg.log0('DB_CLEANER: list objects:', objects_to_remove); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider reverting to log2 or using log1 for these array logs.
Changing from log2
to log0
makes these logs print at the default verbosity level. Since objects_to_remove
, blocks_to_remove
, and filtered_chunks
can contain up to config.DB_CLEANER_DOCS_LIMIT
items (typically 1000), logging full arrays at log0 could be excessively verbose in production environments. Consider using log1
for informational output or keeping log2
for verbose debugging.
- dbg.log0('DB_CLEANER: list objects:', objects_to_remove);
+ dbg.log1('DB_CLEANER: list objects:', objects_to_remove);
Apply similar changes to lines 71 and 77.
Also applies to: 71-71, 77-77
🤖 Prompt for AI Agents
In src/server/bg_services/db_cleaner.js around lines 65, 71 and 77, currently
dbg.log0 is used to print large arrays (objects_to_remove, blocks_to_remove,
filtered_chunks) which is too verbose for default logging; change those dbg.log0
calls back to dbg.log2 (or dbg.log1 if you want informational but not very
verbose output) so array dumps remain at a non-default verbosity level,
preserving dbg.log0 for concise production logs; update the three lines
accordingly and keep the log message text the same so only the log level is
adjusted.
Describe the Problem
Use new 'read only' pool (connected to cnpg replicated read-only cluster) in order to alleviate load from main, writable pg cluster.
Explain the Changes
-db cleancer
-object reclaimer's first find
-scrubber's first find
Issues: Fixed #xxx / Gap #xxx
Testing Instructions:
Summary by CodeRabbit
New Features
Improvements