Ethereum clients initial sync benchmark

Parity Ethereum vs Geth initial sync benchmark

ethereum

Author

banteg

Published

November 7, 2018

Syncing the Ethereum blockchain head-on, by fetching every block and replaying every transaction, is impractical. That’s why full nodes use clever techniques to shorten the initial sync and become usable in a reasonable time.

According to Ethernodes, Ethereum mainnet is dominated by Geth (55.7%) and Parity Ethereum (31.5%), so we’ll discuss these two. All tests were run in October 2018 on a $30/mo dedicated server with the following config:

2.4 GHz CPU with 8 cores
16 GB of RAM
250 GB SSD
1 Gbit network

Geth

Geth (also known as go-ethereum) is written in Go and uses Google’s LevelDB as a storage backend. The default sync mode is called fast. It downloads all the needed data in parallel:

Block headers (6,475,108 headers took 13 hours 25 minutes)
Transaction receipts (same amount, finished roughly at the same time)
State tries (syncing 227.8 million states took 1 day 10 hours)

A few hours into the sync, a message saying “Database compacting, degraded performance” starts to appear. The first one shows up at the 4 hours 13 minutes mark, and it gets worse after that. Geth ends up spending 4 hours 19 minutes (¹⁄₈ of the total time) completely halted by LevelDB compaction. After the headers and receipts finished downloading, the state sync continued without interruptions.
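As a quick sanity check of those figures, here is a minimal sketch that recomputes the compaction share and the average header download rate. It only uses the numbers quoted in this post; the variable names are made up for illustration.

```python
from datetime import timedelta

# Figures quoted in this post for the Geth fast sync.
total_sync = timedelta(days=1, hours=10)
compaction_stall = timedelta(hours=4, minutes=19)
headers = 6_475_108
header_time = timedelta(hours=13, minutes=25)

# Dividing one timedelta by another yields a plain float ratio.
stall_share = compaction_stall / total_sync
headers_per_second = headers / header_time.total_seconds()

print(f"compaction stall: {stall_share:.1%} of the total sync")
print(f"average header rate: {headers_per_second:.0f} headers/s")
```

The stall share comes out slightly above 12.5%, which is where the ¹⁄₈ figure comes from.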

A few minutes after the last state entry was downloaded, the node appeared fully synced.

The final time is 1 day 10 hours. The database size is 130.9 GB.

Unfortunately, Geth’s eth_syncing method doesn’t provide enough meaningful info to calculate the remaining time. It would be great to know how many states there are to sync, but traversing the whole state trie is probably a very expensive operation.
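To illustrate the problem, here is a rough sketch that summarizes a raw eth_syncing result. It assumes the response carries hex-encoded currentBlock, highestBlock, pulledStates and knownStates fields, which is how I remember Geth reporting fast sync progress; check the output of your own node. The sample values are invented, not measured.

```python
def summarize_geth_sync(status):
    """Summarize a raw eth_syncing result whose quantities are hex strings."""
    def as_int(key):
        return int(status[key], 16)

    current, highest = as_int("currentBlock"), as_int("highestBlock")
    pulled, known = as_int("pulledStates"), as_int("knownStates")
    return {
        "block_progress": current / highest if highest else 0.0,
        "blocks_left": highest - current,
        "states_pulled": pulled,
        # knownStates only counts state entries discovered so far, so this
        # is a lower bound on the remaining work, not the real total.
        "states_left_lower_bound": known - pulled,
    }


# Hypothetical response, for illustration only.
sample = {
    "startingBlock": "0x0",
    "currentBlock": hex(3_000_000),
    "highestBlock": hex(6_475_108),
    "pulledStates": hex(50_000_000),
    "knownStates": hex(50_120_000),
}
summary = summarize_geth_sync(sample)
print(summary)
```

Block progress can be extrapolated, but the state count keeps growing as the trie is explored, so no honest ETA falls out of these numbers.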

Parity Ethereum

Parity Ethereum (formerly Parity) is written in Rust and uses Facebook’s RocksDB as a storage backend. RocksDB is backwards compatible with LevelDB and has a bunch of additional features.

Parity syncs in warp mode by default, which works like this:

Pull a recent snapshot (2209 chunks took 2 hours 20 minutes)
Sync the few remaining blocks (4651 blocks took 1 hour 33 minutes)
Process older blocks in the background (6,500,000 blocks in 1 day 2 hours 39 minutes)

This makes the node usable after about 4 hours. You can interact with the blockchain and query the recent state and the blocks that came after the snapshot, while the older blocks remain unavailable until they are processed.

The final time is 1 day 6 hours 32 minutes. The database size is 125.6 GB.
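The totals follow directly from the three phases listed above. A small sketch that adds them up and compares the result with Geth’s 1 day 10 hours, using only figures from this post (the dictionary keys are just labels I picked):

```python
from datetime import timedelta

# Parity warp sync phases, as measured in this post.
phases = {
    "snapshot": timedelta(hours=2, minutes=20),
    "recent blocks": timedelta(hours=1, minutes=33),
    "older blocks": timedelta(days=1, hours=2, minutes=39),
}

# The node is usable once the snapshot and the recent blocks are in.
usable_after = phases["snapshot"] + phases["recent blocks"]
parity_total = sum(phases.values(), timedelta())
geth_total = timedelta(days=1, hours=10)

print(usable_after)               # 3:53:00
print(parity_total)               # 1 day, 6:32:00
print(geth_total - parity_total)  # 3:28:00
```

So the full Parity sync finished about three and a half hours ahead of Geth, but the more important number is the first one: the node could answer queries after under four hours.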

Parity’s eth_syncing reports the total number of warp chunks as well as the total number of blocks, which makes it possible to estimate the remaining time. It doesn’t report the status of the third step, but it can be monitored with a simple script like this:

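from time import sleep

from web3.auto import w3

# Probe every 10,000th block and wait until Parity has backfilled it.
for n in range(0, w3.eth.blockNumber, 10000):
    # Assumes getBlock returns a falsy value for a block that isn't available yet.
    while not w3.eth.getBlock(n):
        sleep(60)  # check again in a minute
    print(n)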
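For the first step, the chunk counters from eth_syncing are enough for a rough ETA. The sketch below assumes the raw response exposes hex-encoded warpChunksAmount and warpChunksProcessed fields, and that the download keeps its average pace, which it may well not. The helper names and the sample values are hypothetical.

```python
def estimate_remaining(done, total, elapsed_seconds):
    """Linear extrapolation: the remaining work proceeds at the average rate so far."""
    if done <= 0 or elapsed_seconds <= 0:
        return None  # not enough data to extrapolate
    rate = done / elapsed_seconds
    return (total - done) / rate


def parity_warp_eta(status, elapsed_seconds):
    """Seconds left in the snapshot phase, or None if it can't be estimated."""
    processed = status.get("warpChunksProcessed")
    amount = status.get("warpChunksAmount")
    if processed is None or amount is None:
        return None  # the node is not restoring a snapshot
    return estimate_remaining(int(processed, 16), int(amount, 16), elapsed_seconds)


# Hypothetical reading: 1000 of 2209 chunks after one hour.
sample = {"warpChunksAmount": hex(2209), "warpChunksProcessed": hex(1000)}
eta = parity_warp_eta(sample, elapsed_seconds=3600)
print(f"about {eta / 60:.0f} minutes left")
```

The same estimate_remaining helper works for the second step if you feed it currentBlock and highestBlock instead of the chunk counters.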

If you want to get a deeper understanding of these challenges, I recommend reading the TurboSync paper.

Hope you learned something useful.