Conversation

@jcnelson
Member

This actually fixes #3860, which was closed prematurely since it was not addressed by PR #3992. Doing this now because this is a blocker for the block sync system.

This PR creates a separate DB for holding Nakamoto blocks, distinct from the DB that holds processed Nakamoto block headers. This is necessary because we can expect writes to occur concurrently on both DBs: the network stack will open transactions against the staging blocks DB to store blocks it receives, while the miner and chains-coordinator will open transactions on the headers DB as they create and process blocks. In Stacks today, both datasets live in the same DB, which regularly leads to DB contention: the chains-coordinator's and miner's block-processing prevents the block downloader from storing downloaded blocks, which slows down IBD and block delivery.

Because latency is much more of a concern in Nakamoto, it's going to be important that the system can open both kinds of transactions concurrently. This PR makes it possible by storing staging blocks separately from processed block headers.

The most substantial change in this PR is that updating the processed flag on a block no longer executes within the same transaction that writes the block header. However, this shouldn't lead to any concurrency issues, because (1) only the chains-coordinator thread performs these tasks, and (2) the code that queries processed blocks has been updated to read from both the staging blocks DB and the headers DB. These reads won't block ongoing writes, nor be blocked by them, since both DBs use WAL mode.

… to one without blocking writes to the other. This required some changes in how we handle processed blocks, but as long as there's a single thread responsible for finding and processing blocks, we should be fine
@codecov

codecov bot commented Feb 13, 2024

Codecov Report

Attention: 125 lines in your changes are missing coverage. Please review.

Comparison: base (2892f05) at 76.81% vs. head (df95923) at 83.29%.

Files Patch % Lines
stackslib/src/chainstate/nakamoto/mod.rs 57.14% 69 Missing ⚠️
stackslib/src/chainstate/burn/db/sortdb.rs 60.00% 20 Missing ⚠️
...tackslib/src/chainstate/nakamoto/staging_blocks.rs 89.79% 20 Missing ⚠️
stackslib/src/chainstate/nakamoto/tests/mod.rs 91.09% 13 Missing ⚠️
testnet/stacks-node/src/nakamoto_node/miner.rs 50.00% 2 Missing ⚠️
stackslib/src/util_lib/db.rs 95.23% 1 Missing ⚠️
Additional details and impacted files
@@                         Coverage Diff                         @@
##           feat/nakamoto-inv-state-machine    #4373      +/-   ##
===================================================================
+ Coverage                            76.81%   83.29%   +6.47%     
===================================================================
  Files                                  447      448       +1     
  Lines                               321012   321363     +351     
===================================================================
+ Hits                                246576   267668   +21092     
+ Misses                               74436    53695   -20741     


Contributor

@kantai kantai left a comment


This LGTM -- just have some pretty superficial comments.

// within the same, single thread. Also, it's *very important* that this update
// succeeds, since *we have already processed* the block.

loop {
Contributor


The two loops in this function should (at least) be factored out into helpers. I'm also not sure why this looping retry logic needs to be here anyway: other tx_begin(), execute, and commit() blocks don't have this logic, so why is it specifically necessary here?

Member Author


Once we commit the block, we must update the processed/orphaned flags in the staging DB. Failure to do that is either a panic (which leaves the node in a corrupt state) or a loop (which at least gives the operator a chance to fix the underlying issue).

@jcnelson jcnelson requested a review from kantai February 15, 2024 20:57
let pox_constants = sort_tx.context.pox_constants.clone();
let (receipt, clarity_commit) = match NakamotoChainState::append_block(

// NOTE: because block status is updated in a separate transaction, we need `chainstate_tx`
Contributor


haha sneaky sneaky. Very helpful comment.

Contributor

@kantai kantai left a comment


LGTM -- my only comment is that lazy_static should be avoided where possible.

@jcnelson jcnelson merged commit d4b67fc into feat/nakamoto-inv-state-machine Feb 16, 2024
@blockstack-devops
Contributor

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@stacks-network stacks-network locked as resolved and limited conversation to collaborators Nov 5, 2024
@wileyj wileyj deleted the feat/separate-nakamoto-staging-db branch March 11, 2025 21:20