core/state/snapshot: faster snapshot generation #22504
Conversation
I agree that it doesn't make a whole lot of difference to use stackTrie for small tries (although there's still some difference), but the even larger upside is that it saves us from even resolving the trie in the first place.

Generation completed on

This one now needs a rebase

Rebased!
```go
data := SlimAccountRLP(acc.Nonce, acc.Balance, acc.Root, acc.CodeHash)

// If the account is not yet in-progress, write it out
if accMarker == nil || !bytes.Equal(accountHash[:], accMarker) {
```
I need to think about this a bit wrt continuations if the account changes while the generator is suspended.
```go
accTrie, err := trie.NewSecure(dl.root, dl.triedb)
if err != nil {
	// The account trie is missing (GC), surf the chain until one becomes available
	stats.Log("Trie missing, state snapshotting paused", dl.root, dl.genMarker)
```
We need to ensure this scenario is handled correctly (Geth is restarted and the pruned tries are missing). Your code might still handle it correctly, but I think it's useful to have a more user friendly error message for it (also separates the expected missing root error from unexpected missing trie node errors).
Yes, this scenario is still handled. In the new generator, the error returned by `generateRange` will abort the generation and wait for the external signal to resume it.
There are two `generateRange` calls, one for the accounts and another for the storages. The error returned by the storage `generateRange` will just be propagated to the outer call stack, so eventually all the internal errors are handled by the account `generateRange`.
```go
// The procedure is aborted, either by external signal or internal error
if err != nil {
	if abort == nil { // aborted by internal error, wait for the signal
		abort = <-dl.genAbort
	}
	abort <- stats
	return
}
```
core/state/snapshot/generate.go
Outdated
not sure - can .Value be called twice?
Yes.
WCGW
core/state/snapshot/snapshot.go
Outdated
```diff
 // If the base layer is generating, abort it and save
 if layer.genAbort != nil {
-	abort := make(chan *generatorStats)
+	abort := make(chan *generatorStats, 1) // Discard the stats
```
Is this a good idea? Previously we waited until the running generator actually stopped. Here we only wait until it gets its signal and immediately go ahead executing before the generator had a chance to stop. Why was the previous one bad?
I.e. "discard the stats" is fine, but "not even wait for the stats" is a different thing.
Namely this path https://github.com/ethereum/go-ethereum/pull/22504/files#diff-83ae0dd12996e07452e863cf8b072366443d6c6fbb0fff341399c9c8ce5fe050R551 will do a whole lot of shuffling and locking before it returns.
This version of geth has frozen for me. I only have basic logs.
This PR improves snapshot generation (post-snap-sync) by orders of magnitude.
How snapshot generation works now
When doing a snap sync, we download 'slices' of the trie, individually verified, and use these slices to fill our trie.
The slices of accounts/storage themselves are forgotten.
After this is done, we iterate the entire account trie, and every storage trie, to reconstruct the snapshot database.
This process takes hundreds of hours, and the disk IO is extreme during this process.
With this PR
This PR implements the basic idea described here: https://gist.github.com/rjl493456442/85832a0c760f2bafe2a69e33efe68c60 .
For starters, it stores the snapshots into the database. The account (and storage) data do not actually match up to whatever
root we'll eventually wind up on, but that's for later.
After the snap sync is done, we start generating the snapshot, as before. But this time, we have some (potentially stale) data already laying there.
We proceed in batches (ranges). In this PR, the range-size for accounts is 200, and the range-size for storage is 1024. For any given range (starting at 0x00...), we check the already-present snapshot data against the trie:
- if the whole range proves correct, it is left untouched (but also check the storage),
- elements that differ are updated (and check the storage),
- elements that are missing are created (and check the storage).

The same algo is used for storage tries. The idea being that roughly 99% of all ranges are fine, and that out of the ranges which are not fine, most individual elements are fine.
This improves the IO performance of generation by orders of magnitude.
Some other optimizations:
- For small storage tries (max 1024 items), where we can load the entire range into memory, we can do the verification against the expected root using the stackTrie: just feed everything in there and have it spit out the root, which we can then compare. This means that we don't have to resolve the trie for that storage root at all.
Experimental results (mainnet)
During the process, it read 1.5TB from disk/leveldb (same same). It wrote 22GB in the first 15 minutes (compaction?), and ended up totalling 24GB of leveldb writes and 35GB of disk writes.