-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Light friendly storage tracking: changes trie + extending over ranges #628
Conversation
|
4 probably won't be needed as we'll restart the testnet and any other chains should begin with this on. |
|
@gavofyork I assumed that this PR is most useful for light nodes (though it could be utilized by full nodes as well). And I could remember that @rphmeier has said previously that not all substrate-based chains require a light clients => that's why I made a So if the decision is to build changes tries for all chains, this TODO should be renamed to "make |
|
I would consider putting |
substrate/runtime/system/src/lib.rs
Outdated
| pub ExtrinsicIndex get(extrinsic_index): b"sys:xti" => u32; | ||
| ExtrinsicData get(extrinsic_data): b"sys:xtd" => required map [ u32 => Vec<u8> ]; | ||
| RandomSeed get(random_seed): b"sys:rnd" => required T::Hash; | ||
| ChangesTrieCreationEnabled get(changes_trie_creation_enabled): b"sys:changes_trie_creation_enabled" => default bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
everything in storage is already an option, so bools are generally superfluous. probably better done as:
ChangesTrie get(changes_trie) => ChangesTrieConfig;
and then define ChangesTrieConfig:
#[derive(Encode, Decode, PartialEq, Eq, Clone)]
struct ChangesTrieConfig {
interval: u32,
levels: u32,
}
then it's just if let Some(config) = Self::changes_trie() { /* do stuff */ }
| use changes_trie::storage::InMemoryStorage; | ||
| use super::*; | ||
|
|
||
| fn prepare_for_drilldown() -> (Configuration, InMemoryStorage<KeccakHasher>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be using Blake2Hasher now.
|
Looks good otherwise! |
|
|
The rebasing-upkeep is no-doubt non-trivial, so I'm happy to figure this out sooner rather than later. Issue with using |
| @@ -0,0 +1,194 @@ | |||
| [[package]] | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this checked in accidentally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops...Thanks! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly looks reasonable! Few style nits and questions
| pub(crate) changes_trie_config: Option<ChangesTrieConfig>, | ||
| pub(crate) block: Option<u64>, | ||
| pub(crate) extrinsic: Option<u32>, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This structure has become unwieldy.
I'm proposing to make some changes:
- Explicitly name tuples that are used as values in HashMap for
prospectiveandcommited. It's is not immediatelly clear why it has such arguments. Also usages likeentry.0andentry.1will become much cleaner. - Leave a comment what exactly
blockandextrinsicmean and why they're optional, etc.
|
|
||
| impl OverlayedChanges { | ||
| /// Sets the changes trie configuration. | ||
| pub(crate) fn set_changes_trie_config(&mut self, block: u64, config: ChangesTrieConfig) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function has some some pre-conditionals. I think they're worth documenting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So because they're ultimately used by the runtime ideally all functions in the chain should contain a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really want to require these pre-conditions? Why, for example, set_extrinsic_index doesn't panic if changes_trie_config is none but just no-op? Should we do the same here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting extrinsic index is noop in case if config hasn't been set.
Setting different config leads to the creation of invalid changes trie.
Updated comment.
|
|
||
| //! Substrate changes trie configuration. | ||
| /// An identifier for an authority in the consensus algorithm. The same size as ed25519::Public. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the comment?
| Ok(extrinsic_map.into_iter() | ||
| .map(move |(key, extrinsics)| InputPair::ExtrinsicIndex(ExtrinsicIndex { | ||
| block, | ||
| key: key.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need clone() here?
substrate/runtime-io/without_std.rs
Outdated
| let mut is_set: u8 = 0; | ||
| let mut result: [u8; 32] = Default::default(); | ||
| unsafe { | ||
| ext_storage_changes_root(&mut is_set, result.as_mut_ptr()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we need is_set here? Wouldn't it be easier to just return it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Option is not ConvertibleToWasm. It is easier to return Option than is_set + result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry should have state it more clearer: what i meant is that we can return is_set not via pointer but by the return value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks! :)
|
|
||
| // check storage changes root | ||
| let storage_changes_root = System::Hashing::storage_changes_root(); | ||
| header.changes_root().cloned().check_equal(&storage_changes_root); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need cloned here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because CheckEqual was implemented for Option<T>. Changed to for Option<&T>.
|
|
||
| /// Changes trie storage. Provides access to trie roots and trie nodes. | ||
| pub trait Storage<H: Hasher>: Send + Sync { | ||
| //type ChangesTrieTransaction; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to not check in commented code
|
Ok - I understand a bit better now. So I'm happy that there needs to be the extrinsic I'd suggest that we simply name a couple of storage items:
We can then move the system Generic arguments should be removed by making assumptions about the type of trie in the same way that is currently the case with |
And once again - I'm not changing the trait - I'm adding missing generic parameter to the implementation we use in tests - |
|
Fair enough - looks good. |
pepyakin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will perform more thorough review a bit later today!
| "checksum owning_ref 0.3.3 (registry+https://github.com/rust-lang/crates.io-index)" = "cdf84f41639e037b484f93433aa3897863b561ed65c6e59c7073d7c561710f37" | ||
| "checksum parity-bytes 0.1.0 (registry+https://github.com/rust-lang/crates.io-index)" = "fa5168b4cf41f3835e4bc6ffb32f51bc9365dc50cb351904595b3931d917fd0c" | ||
| "checksum parity-codec 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "9260216bbb7acdbb72dfb6bcb63245ef5bde9a8fb7706fe641e1caf1e7ae460f" | ||
| "checksum parity-codec-derive 1.0.0 (registry+https://github.com/rust-lang/crates.io-index)" = "1eda64d782c342261aea4ca047a609f9bd92d5f9dafabe6b5a396caf5c7b8827" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related to this line/file:
There are some wasm files was checked. I think this was incidental, right?
node/executor/src/lib.rs
Outdated
| } | ||
|
|
||
| fn construct_block(number: BlockNumber, parent_hash: Hash, state_root: Hash, extrinsics: Vec<CheckedExtrinsic>) -> (Vec<u8>, Hash) { | ||
| fn construct_block(number: BlockNumber, parent_hash: Hash, state_root: Hash, changes_root: Option<Hash>, extrinsics: Vec<CheckedExtrinsic>) -> (Vec<u8>, Hash) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we wrap this line?
| b"aba".to_vec() => None, | ||
| b"abd".to_vec() => None, | ||
| b"abc".to_vec() => None.into(), | ||
| b"abb".to_vec() => None.into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why this were changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously: OverlayedChanges.committed: HashMap<Vec<u8>, Option<Vec<u8>>> now OverlayedChanges.committed: HashMap<Vec<u8>, OverlayedChange>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry! I meant the order, not types : )
srml/system/src/lib.rs
Outdated
| } | ||
|
|
||
| /// Sets the index of extrinsic that is currenty executing. | ||
| pub fn set_extrinsic_index(extrinsic_index: u32) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used exclusively for tests? If so, let's put #[cfg(any(feature = "std", test))] on it
pepyakin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work here! There is a lot of stuff going on there and I can't say that I looked into every corner. But where I looked it looks pretty reasonable.
There are a few places where I'd add some docs / do renamings / some refactor. I've noted a few of them.
That said, I don't want to block merging for this, on the contrary: I want to merge this ASAP and fix other issues in following PRs.
@svyatonik can you rebase this please?
| /// | ||
| /// Returns false if configuration has been set already and we now trying | ||
| /// to install different configuration. This isn't supported now. | ||
| pub(crate) fn set_changes_trie_config(&mut self, config: ChangesTrieConfig) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider marking #[must_use] until then.
| if let Some(extrinsic) = extrinsic_index { | ||
| let mut extrinsics = entry.extrinsics.take().unwrap_or_default(); | ||
| extrinsics.insert(extrinsic); | ||
| entry.extrinsics = Some(extrinsics); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about this way of achieving the same
entry.extrinsics.get_or_insert_with(Default::default).insert(extrinsic);?
| // is created after OverlayedChanges | ||
|
|
||
| let changes_trie_config = try_read_overlay_value(overlay, backend, b":changes_trie")?; | ||
| set_changes_trie_config(overlay, changes_trie_config)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set_changes_trie_config has the same name as overlay.set_changes_trie_config but actually also performs a decoding and handling None. Consider renaming this method.
| return DigestBuildIterator::empty(); | ||
| } | ||
|
|
||
| // digest is buievery digest_multiplier blocks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: is built
| begin: u64, | ||
| end: u64, | ||
| max: u64, | ||
| key: &[u8], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does max mean?
| max: u64, | ||
| begin: u64, | ||
| end: u64, | ||
| ) -> Result<(u64, u64, u64, u32), String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be cool to extract struct for the return value.
Implementation of sections "Ordered, indexed, Merklised per-block change-trie" and "Extending over ranges" of #131
Changes trie is built at the end of each block and contains mapping of changed keys to the indices of extrinsics where they have been changed. E.g. if extrinsic#5 has changed FreeBalanceOfAlice from 500 to 400 and extrinsic#8 has changed free FreeBalanceOfBob from 300 to 200, the input for changes trie will be:
Note that comparing to original description from #131, input if free of [ index => key ] pairs, because of this comment.
There are 2 configuration parameters affecting changes trie creation (right now they're configured at genesis and can't be changed later - see TODO for this):
Digest build example:
Changes tries are optional and are built upon runtime request. To announce that it wants to create changes trie, runtime provides configuration to the substrate. At the end of block, if config has been set, the root of changes trie is computed and stored in the block header. The nodes of changes trie are saved in the database on import operation commit.
Bu reading changes tries for a range of blocks, full node can find all ( block, extrinsic ) pairs where given key has been changed in by calling
key_changesmethod. Full node can provide a proof ofkey_changesexecution to light nodes.TODOs:
Fetcher/OnDemandlevel[ changed key => block number ]to speed up digests creation