@randy-cro randy-cro commented Sep 24, 2025

Description

This PR optimises the staking endblocker.
On RocksDB archival nodes, the staking endblocker was found to take up to 1100ms, causing these nodes to lag consistently behind pruned nodes. Block sync performance therefore needed urgent improvement.


Root Cause

  • Fetching the following iterators was slow:
    • ValidatorQueueIterator
    • UBDQueueIterator
    • RedelegationQueueIterator
  • Each iterator took ~300ms to return, even when there were no entries, due to excessively large scan ranges.

Changes Made

  • Cache unbonding validators, unbonding delegations and redelegations:
    Instead of scanning the database from the beginning of time up to the latest block height or timestamp on every block, the endblocker now reads these entries from an in-memory cache, significantly reducing I/O. The iterators are invoked only once, during cache initialization when the node starts.

Results

With telemetry metrics enabled, we observed a significant performance improvement after these optimisations were applied.

Before:
[image: telemetry metrics screenshot]

After:
[image: telemetry metrics screenshot]

Closes: #XXXX


@randy-cro randy-cro changed the base branch from release/v0.53.x to release/v0.50.x September 24, 2025 10:36
@randy-cro randy-cro force-pushed the debug/validator-queue branch from 6bc0bee to 99af2c5 Compare September 24, 2025 10:39
@randy-cro randy-cro force-pushed the debug/validator-queue branch from 464fb46 to f1c53f9 Compare October 14, 2025 08:24
Collaborator

@thomas-nguy thomas-nguy left a comment

lgtm

@randy-cro randy-cro merged commit d8ed3ff into crypto-org-chain:release/v0.50.x Oct 15, 2025
46 of 48 checks passed
@randy-cro randy-cro deleted the debug/validator-queue branch October 15, 2025 03:02
randy-cro added a commit that referenced this pull request Oct 21, 2025
perf: optimise staking endblocker (#1725)
randy-cro added a commit to randy-cro/cosmos-sdk that referenced this pull request Oct 22, 2025
randy-cro added a commit that referenced this pull request Oct 28, 2025
perf: optimise staking endblocker (#1725)
normalize cache validator queue key to be UTC (#1730)
thomas-nguy pushed a commit that referenced this pull request Oct 31, 2025
randy-cro added a commit to randy-cro/cosmos-sdk that referenced this pull request Nov 17, 2025
randy-cro added a commit that referenced this pull request Nov 17, 2025
* Revert "perf: optimise staking endblocker (#1725)"

This reverts commit d8ed3ff.

* gofumpt
randy-cro added a commit to randy-cro/cosmos-sdk that referenced this pull request Nov 17, 2025