Skip to content

Conversation

@mediocregopher
Copy link
Collaborator

Towards #19512

This implements the skeleton of the new proof calculator rewrite. No actual logic is implemented yet, this only sets up most of the new types which will be involved.

  • RevealedSparseNode is renamed to SparseTrieNode and moved to reth-trie-common. This is the type which will be returned from the calculator for proofs (letting avoid a translation step during sparse trie revealing). The rename helps to denote that this type is no longer just for revealing.

  • ValueEncoder is defined. This is the primary mechanism by which we deal with the differences between storage and account tries, and which allows us to inject behavior in the future like dispatching storage root calculation to other threads for account proofs.

    • ValueEncoder::Value - Either Account or U256 for account and storage tries, respectively. This is the value returned from the DB.

    • ValueEncoder::Fut - A future-like type which will be called-upon later to encode the Value into its RLP form. For storage tries (U256) this is trivial. For account tries we need some mechanism to obtain the storage root of the account. A default SyncAccountValueEncoder is provided which synchronously computes the storage root when the future is invoked, but in later PRs we can to proof workers, add in caching, etc...

  • ProofCalculator is where the actual logic of calculating proofs is going to live. For the moment it is un-implemented.

Towards #19512

In order to implement a reusable proof calculator we first need the
underlying cursors used by the calculator to be reusable.

For account cursors this isn't so difficult, a new `reset` method is
introduced which resets the ForwardInMemoryCursor and other state
fields.

For storage cursors the situation is complicated because reusing the
cursor might involve using it for a completely different hashed address.

To handle this the storage cursors are given a `set_hashed_address`
method which effectively resets them, as well as pulls out the correct
overlay for the chosen address. Implementing this requires that the
cursors now hold onto the full
`&TrieUpdatesSorted`/`&HashedPostStateSorted`.

It also requires slightly
different handling of the wiped case; before we were simply storing an
underlying cursor in an Option, with None indicating wiped, but now we
need to always have an underlying cursors (so we can reuse it for a
future non-wiped storage). A new boolean is introduced instead.
Towards #19512

This implements the skeleton of the new proof calculator rewrite. No actual logic is implemented yet, this only sets up most of the new types which will be involved.

* RevealedSparseNode is renamed to SparseTrieNode and moved to reth-trie-common. This is the type which will be returned from the calculator for proofs (letting avoid a translation step during sparse trie revealing). The rename helps to denote that this type is no longer just for revealing.

* ValueEncoder is defined. This is the primary mechanism by which we deal with the differences between storage and account tries, and which allows us to inject behavior in the future like dispatching storage root calculation to other threads for account proofs.

  * ValueEncoder::Value - Either Account or U256 for account and storage tries, respectively. This is the value returned from the DB.

  * ValueEncoder::Fut - A future-like type which will be called-upon later to encode the Value into its RLP form. For storage tries (U256) this is trivial. For account tries we need some mechanism to obtain the storage root of the account. A default `SyncAccountValueEncoder` is provided which synchronously computes the storage root when the future is invoked, but in later PRs we can to proof workers, add in caching, etc...

* ProofCalculator is where the actual logic of calculating proofs is going to live. For the moment it is un-implemented.
@github-project-automation github-project-automation bot moved this to Backlog in Reth Tracker Nov 12, 2025
Base automatically changed from mediocregopher/reusable-hashed-trie-cursors to main November 12, 2025 17:46
@mediocregopher mediocregopher marked this pull request as ready for review November 17, 2025 15:02
let trie_account = self.account.into_trie_account(storage_root);

// Encode the trie account
buf.clear();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure if the encode implementation should be responsible for clearing the output buffer

also, I see that we clear it above in map(), is it correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed one of the calls to clear, we can make sure in the ProofCalculator implementation to only pass cleared bufs into encode.

It is correct that we are clearing after the encoding in the map. Within the map we are RLP-encoding the root node of the storage trie, which we then hash to get the storage root. We then re-use the buffer later to RLP encode the TrieAccount, which contains the storage root as a field.

@shekhirin shekhirin added C-enhancement New feature or request A-trie Related to Merkle Patricia Trie implementation labels Nov 18, 2025
///
/// The encoder takes a reference to itself and a value, returning a future-like
/// type that will perform the encoding when needed.
pub trait ValueEncoder {
Copy link
Member

@yongkangc yongkangc Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non blocking, but took me a few looks to understand ValueEncoder.

do you think ProofValueEncoder might be better?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to renaming it, but tbh I don't think ProofValueEncoder adds any extra context that helps understand it. Maybe LeafValueEncoder?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thats good LeafValueEncoder

Copy link
Member

@yongkangc yongkangc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i left some comments. the only part that was confusing to me was the impl encode function and why we need to do that / what it needs to replace.

i am guessing that it replaces

pub fn multiproof(
where given an Account + hashed_address, we compute the right storage_root, build a TrieAccount, and write its RLP bytes into buf so the proof can include the correct account leaf.

other than these, lgtm and left some nits that are none blocking

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Nov 18, 2025

/// Carries all information needed by a sparse trie to reveal a particular node.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SparseTrieNode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the rename to SparseTrieNode a bit confusing because it's returned from non sparse trie specific proof_v2 module. Should it just be TrieNode?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I avoided TrieNode because that's already a type in alloy-trie that gets re-exported from reth-trie-common, and it's one of the fields on this struct. Wdyt about ProofTrieNode? We already have ProofTrieNodeProviderFactory elsewhere so I think it makes some sense

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm!

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was quite easy to follow, all makes sense.

only have some suggestions re Fut, but also something we could change later

///
/// The returned future will be called as late as possible in the algorithm to maximize
/// the time available for parallel computation (e.g., storage root calculation).
fn encoder_fut(&self, key: B256, value: Self::Value) -> Self::Fut;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can also do -> impl ValueEncoderFut unless we need named types for this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Named types will be useful for future PRs, because we're going to be storing the "future" in an enum.

@mediocregopher
Copy link
Collaborator Author

i left some comments. the only part that was confusing to me was the impl encode function and why we need to do that / what it needs to replace.

i am guessing that it replaces

pub fn multiproof(

where given an Account + hashed_address, we compute the right storage_root, build a TrieAccount, and write its RLP bytes into buf so the proof can include the correct account leaf.

other than these, lgtm and left some nits that are none blocking

Yes that's exactly right, the idea is that we can have more flexibility with how we obtain the storage root during account proof calculation. Using the encoder we can inject existing behavior like obtaining results from storage proof workers and the missed leaves cache, but also do things which don't make sense to do currently like dispatching a storage proof job for missed leaves.

/// The type that will compute and encode the value when needed.
type DeferredEncoder: DeferredValueEncoder;

/// Returns a future-like value that will RLP-encode the value when called.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

residue of future terminology


/// Carries all information needed by a sparse trie to reveal a particular node.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SparseTrieNode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm!

Copy link
Member

@yongkangc yongkangc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mediocregopher mediocregopher added this pull request to the merge queue Nov 19, 2025
Merged via the queue into main with commit c57792c Nov 19, 2025
42 checks passed
@mediocregopher mediocregopher deleted the mediocregopher/proof-rewrite-skeleton branch November 19, 2025 16:48
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Nov 19, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-trie Related to Merkle Patricia Trie implementation C-enhancement New feature or request

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

5 participants