Author: Wu Yue, Geek web3
The EVM is Ethereum's execution engine and smart contract execution environment, and it is one of Ethereum's most important core components. A public chain is an open network with thousands of nodes whose hardware varies greatly. For a smart contract to produce the same result on every node, that is, to satisfy consistency, the same environment has to be reproduced on different devices, and a virtual machine does exactly that.
Ethereum's virtual machine, the EVM, runs smart contracts in the same way across operating systems (such as Windows, Linux, and macOS) and devices. This cross-platform compatibility ensures that every node gets a consistent result after running a contract. The most typical example of a virtual machine built on this idea is the Java Virtual Machine (JVM).
The smart contracts we usually see in a block explorer are first compiled into EVM bytecode and then stored on-chain. When executing a contract, the EVM reads the bytecode in sequence, and each instruction (opcode) has a corresponding gas cost. The EVM tracks the gas consumed by each instruction during execution, and the cost depends on the complexity of the operation.
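The per-instruction gas accounting described above can be sketched with a toy stack machine. This is a minimal illustration, not the EVM itself: the four opcodes, the `run` function, and the `OutOfGas` exception are hypothetical names, and the cost table is a simplified stand-in for the real gas schedule, which covers far more opcodes and changes across network upgrades.

```python
# Illustrative gas schedule (assumption): cheap arithmetic, slightly pricier multiply.
GAS_COST = {"PUSH": 3, "ADD": 3, "MUL": 5, "STOP": 0}


class OutOfGas(Exception):
    """Raised when the next instruction would exceed the gas limit."""


def run(program, gas_limit):
    """Execute (opcode, operand) pairs in order, charging gas per instruction."""
    stack, gas_used = [], 0
    for op, arg in program:
        cost = GAS_COST[op]
        if gas_used + cost > gas_limit:
            raise OutOfGas(f"cannot afford {op} (cost {cost})")
        gas_used += cost
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "MUL":
            stack.append(stack.pop() * stack.pop())
        elif op == "STOP":
            break
    return stack, gas_used


# (2 + 3) * 4, executed strictly in sequence
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None),
           ("PUSH", 4), ("MUL", None), ("STOP", None)]
result, gas_used = run(program, gas_limit=100)
print(result, gas_used)  # [20] 17
```

The same program run with `gas_limit=10` raises `OutOfGas` at the fourth instruction, which mirrors how a transaction that runs out of gas stops partway through.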
In addition, as Ethereum's core execution engine, the EVM processes transactions serially. All transactions wait in a single queue and are executed in a fixed order. Parallelization is not used because a blockchain must strictly guarantee consistency: every node must process a batch of transactions in the same order. If transaction processing were parallelized, the execution order would be hard to determine in advance unless a scheduling algorithm were introduced, and that adds complexity.
Due to time constraints, the founding team of Ethereum in 2014-15 chose the serial execution method because it is simple in design and easy to maintain. However, with the iteration of blockchain technology and the growing user base, blockchain has higher and higher requirements for TPS and throughput. After the emergence and maturity of Rollup technology, the performance bottleneck brought by EVM serial execution has been exposed on Ethereum Layer 2.
As a key component of Layer 2, the Sequencer takes on all computation as a single server. If the external modules that work with the Sequencer are efficient enough, the final bottleneck is the Sequencer itself, and serial execution becomes a huge obstacle.
The opBNB team has pushed the optimization of the DA layer and the data read/write modules to the limit, and its Sequencer can execute up to about 2,000 ERC-20 transfers per second. This number seems high, but if the transactions to be processed are much more complex than ERC-20 transfers, TPS will inevitably drop sharply. Parallel transaction processing is therefore an inevitable trend.
Below, we go into more specific detail on the limitations of the traditional EVM and the advantages of a parallel EVM.
Two core components of Ethereum transaction execution
At the code module level, the other core component involved in transaction execution in go-ethereum, besides the EVM, is stateDB, which manages account state and data storage in Ethereum. Ethereum uses a tree structure called the Merkle Patricia Trie as the database index (directory). Each time the EVM executes a transaction, some of the data stored in stateDB changes, and these changes are eventually reflected in the Merkle Patricia Trie (hereinafter referred to as the global state tree).
Specifically, stateDB maintains the state of all Ethereum accounts, including EOA accounts and contract accounts, and the data it stores includes account balances, smart contract code, and so on. During transaction execution, stateDB reads and writes the data of the accounts involved. After execution is completed, stateDB submits the new state to the underlying database (such as LevelDB) for persistence.
In general, the EVM interprets and executes smart contract instructions and changes the on-chain state according to the results, while stateDB acts as the global state store that manages the state changes of all accounts and contracts. Together they form Ethereum's transaction execution environment.
The specific process of serial execution
There are two types of Ethereum transactions: EOA transfers and contract transactions. EOA transfers are the simplest type, that is, ETH transfers between ordinary accounts. This type of transaction does not involve contract calls and is processed very quickly. Because the operation is so simple, an EOA transfer is charged only the base cost of 21,000 gas, the minimum for any transaction.
Unlike simple EOA transfers, contract transactions involve the calling and execution of smart contracts. When processing contract transactions, EVM must interpret and execute the bytecode instructions in the smart contract one by one. The more complex the logic of the contract, the more instructions are involved and the more resources are consumed.
For example, an ERC-20 transfer takes about twice as long to process as an EOA transfer, and more complex contract interactions, such as a trade on Uniswap, take longer still and can be more than ten times slower than an EOA transfer. This is because a DeFi protocol has to handle liquidity pools, price calculations, token swaps, and similar logic during a transaction, which requires far more computation.
So in serial execution mode, how do the two components EVM and stateDB collaborate to process transactions?
In Ethereum's design, the transactions within a block are processed one by one in order, and each transaction (tx) gets an independent EVM instance to perform its operations. Although each transaction uses a different EVM instance, all transactions share the same state database, that is, stateDB.
During the transaction execution process, EVM needs to continuously interact with stateDB, read relevant data from stateDB, and write the changed data back to stateDB.
Let's take a look at how EVM and stateDB collaborate to execute transactions from a code perspective:
1. The processBlock() function calls the Process() function to process the transactions contained in a block;
2. A for loop is defined in the Process() function, and you can see that transactions are executed one by one;
3. After all transactions are processed, the processBlock() function calls the writeBlockWithState() function, which in turn calls the statedb.Commit() function to submit the state changes.
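The three steps above can be sketched as follows. This is a simplified Python illustration rather than go-ethereum's actual Go code: the `StateDB` class, `apply_transaction`, and the `(sender, recipient, amount)` transaction tuples are hypothetical stand-ins, and the function names only loosely mirror processBlock(), Process(), and statedb.Commit().

```python
class StateDB:
    """Hypothetical in-memory stand-in for the shared state database."""

    def __init__(self, balances):
        self.committed = dict(balances)  # last persisted state
        self.working = dict(balances)    # state being modified by the current block

    def get_balance(self, addr):
        return self.working.get(addr, 0)

    def set_balance(self, addr, value):
        self.working[addr] = value

    def commit(self):
        """Persist the working state once the whole block has been processed."""
        self.committed = dict(self.working)


def apply_transaction(statedb, tx):
    """Execute one transfer against the shared stateDB; return True on success."""
    sender, recipient, amount = tx
    if statedb.get_balance(sender) < amount:
        return False
    statedb.set_balance(sender, statedb.get_balance(sender) - amount)
    statedb.set_balance(recipient, statedb.get_balance(recipient) + amount)
    return True


def process(statedb, txs):
    """Step 2: a plain for loop, so transactions run strictly one after another."""
    return [apply_transaction(statedb, tx) for tx in txs]


def process_block(statedb, txs):
    """Steps 1 and 3: process every transaction, then commit the state changes."""
    receipts = process(statedb, txs)
    statedb.commit()
    return receipts


db = StateDB({"alice": 10, "bob": 0})
receipts = process_block(
    db, [("alice", "bob", 7), ("bob", "carol", 5), ("alice", "carol", 9)]
)
print(receipts)      # [True, True, False]
print(db.committed)  # {'alice': 3, 'bob': 2, 'carol': 5}
```

The second transfer succeeds only because the first has already credited bob, which is why every node must apply the transactions in the same order.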
When all transactions in a block are executed, the data in stateDB will be committed to the global state tree (Merkle Patricia Trie) mentioned above, and a new state root (stateRoot) will be generated. The state root is an important parameter in each block, which records the "compression result" of the new global state after the block is executed.
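To give a feel for the state root as a "compression result", here is a deliberately simplified sketch. It assumes a flat SHA-256 hash over the sorted key-value pairs; Ethereum itself derives the root from the Merkle Patricia Trie using Keccak-256, so the `state_root` function below is a hypothetical stand-in that only shows the property that matters here: the same state always yields the same short digest, and any change to the state yields a different one.

```python
import hashlib


def state_root(state):
    """Digest a whole state dictionary into one fixed-size hex string.

    Assumption: a flat hash over sorted entries replaces the real trie structure.
    """
    h = hashlib.sha256()
    for key in sorted(state):
        h.update(f"{key}={state[key]};".encode())
    return h.hexdigest()


root_a = state_root({"alice": 3, "bob": 7})
root_b = state_root({"bob": 7, "alice": 3})  # same state, different insertion order
root_c = state_root({"alice": 3, "bob": 8})  # one balance differs

print(root_a == root_b)  # True
print(root_a == root_c)  # False
```

Because nodes only need to compare this one value, two nodes that executed the block differently will notice immediately that their roots disagree.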
The bottleneck of the EVM's serial execution mode is clear: transactions must queue up and execute in order. If one smart contract transaction takes a long time, every other transaction has to wait until it finishes. This leaves hardware resources such as multi-core CPUs underused and severely limits efficiency.
EVM multi-threaded parallel optimization solution
If we use real-life examples to compare serial execution and parallel execution, the former is analogous to a bank with only one counter, and parallel EVM is analogous to a bank with multiple counters. In parallel mode, multiple threads can be opened to process multiple transactions at the same time, and the efficiency can be improved several times, but the tricky part is the state conflict problem.
If multiple transactions all attempt to rewrite the data of the same account, conflicts arise when they are processed at the same time. For example, suppose only one NFT can be minted and both transaction 1 and transaction 2 try to mint it. If both requests are satisfied, the result is clearly wrong, so coordination is needed. In practice, state conflicts occur more often than this simple example suggests, so parallel transaction processing must include measures for dealing with them.
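The NFT example can be made concrete with a small sketch. The `mint` function and the `nft_owner` key are hypothetical; the point is only that two transactions judged against the same starting snapshot both believe they succeeded, whereas serial execution lets the second one see the first one's write.

```python
def mint(snapshot, minter):
    """Return the writes a mint would make, judged only against the snapshot it saw."""
    if snapshot["nft_owner"] is None:
        return {"nft_owner": minter}
    return {}  # already minted: nothing to write


state = {"nft_owner": None}

# Naive parallel execution: both transactions read the same starting state.
writes_1 = mint(dict(state), "tx1_sender")
writes_2 = mint(dict(state), "tx2_sender")
both_succeeded = bool(writes_1) and bool(writes_2)
print(both_succeeded)  # True, although only one NFT exists

# Serial execution: tx2 observes tx1's write and does nothing.
serial = dict(state)
serial.update(mint(serial, "tx1_sender"))
serial.update(mint(serial, "tx2_sender"))
print(serial)  # {'nft_owner': 'tx1_sender'}
```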
Reddio's parallel optimization principle for EVM
Let's take a look at the parallel optimization ideas of the ZKRollup project Reddio for EVM. Reddio's idea is to assign a transaction to each thread and provide a temporary state database in each thread, called pending-stateDB. The details are as follows:
1. Multithreaded parallel execution of transactions: Reddio sets up multiple threads to process different transactions at the same time, and the threads do not interfere with each other. This can increase the transaction processing speed several times.
2. Allocate a temporary state database for each thread: Reddio allocates an independent temporary state database (pending-stateDB) to each thread. When executing transactions, each thread will not directly modify the global stateDB, but temporarily record the state change results in pending-stateDB.
3. Synchronize state changes: After all transactions in a block are executed, EVM will synchronize the state change results recorded in each pending-stateDB to the global stateDB in sequence. If there is no state conflict between different transactions during execution, the records in pending-stateDB can be merged smoothly into the global stateDB.
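The three steps above can be sketched for the conflict-free case. This is an illustration under stated assumptions, not Reddio's implementation: the `execute` function, the plain dictionaries standing in for the global stateDB and each pending-stateDB, and the transfer tuples are all hypothetical. Note also that CPython threads do not run Python bytecode on several cores at once, so the sketch shows the structure of the approach rather than an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def execute(tx, global_state):
    """Run one transfer in its own thread.

    The global state is only read; every change goes into a private
    dictionary that plays the role of the thread's pending-stateDB.
    """
    sender, recipient, amount = tx
    pending = {}
    pending[sender] = global_state.get(sender, 0) - amount
    pending[recipient] = global_state.get(recipient, 0) + amount
    return pending


global_state = {"alice": 10, "bob": 0, "carol": 5, "dave": 0}
txs = [("alice", "bob", 4), ("carol", "dave", 2)]  # touch disjoint accounts

# Steps 1 and 2: one thread per transaction, each with its own pending state.
with ThreadPoolExecutor(max_workers=2) as pool:
    pendings = list(pool.map(lambda tx: execute(tx, global_state), txs))

# Step 3: merge the pending states into the global state in transaction order.
for pending in pendings:
    global_state.update(pending)

print(global_state)  # {'alice': 6, 'bob': 4, 'carol': 3, 'dave': 2}
```

The merge is safe here only because the two transfers touch different accounts.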
Reddio has optimized the way read and write operations are handled to ensure that transactions can correctly access state data and avoid conflicts.
· Read operation: When a transaction needs to read state, the EVM first checks the ReadSet of its pending-stateDB. If the ReadSet contains the required data, the EVM reads it directly from the pending-stateDB. If the key-value pair is not found in the ReadSet, the historical state is read from the global stateDB as of the previous block.
· Write operation: No write (that is, no modification to state) goes directly to the global stateDB; it is first recorded in the WriteSet of the pending-stateDB. After the transaction is executed, conflict detection runs before the state changes are merged into the global stateDB.
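The read and write paths above can be sketched as a small class. The `PendingState` class and its method names are hypothetical, and one detail is an assumption that the text does not spell out: the sketch lets a transaction see its own earlier writes by checking the WriteSet before the ReadSet.

```python
class PendingState:
    """Hypothetical per-thread pending-stateDB with a ReadSet and a WriteSet."""

    def __init__(self, global_state):
        self.global_state = global_state  # state as of the previous block; never written here
        self.read_set = {}                # key -> value observed by this transaction
        self.write_set = {}               # key -> value this transaction wants to store

    def read(self, key):
        if key in self.write_set:         # assumption: a transaction sees its own writes
            return self.write_set[key]
        if key not in self.read_set:      # first access: fall back to the global state
            self.read_set[key] = self.global_state.get(key, 0)
        return self.read_set[key]

    def write(self, key, value):
        self.write_set[key] = value       # recorded locally, merged later


global_state = {"alice": 10}
pending = PendingState(global_state)
pending.write("alice", pending.read("alice") - 3)

print(pending.read_set)   # {'alice': 10}
print(pending.write_set)  # {'alice': 7}
print(global_state)       # {'alice': 10}
```

Recording what each transaction read, and not just what it wrote, is what later makes conflict detection possible.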
The key problem of parallel execution is state conflicts, which are especially likely when multiple transactions attempt to read and write the state of the same account. To address this, Reddio introduced a conflict detection mechanism:
· Conflict detection: During the transaction execution process, EVM monitors the ReadSet and WriteSet of different transactions. If multiple transactions are found to attempt to read and write the same state item, it is considered a conflict.
· Conflict handling: When a conflict is detected, the conflicting transaction will be marked as needing to be re-executed.
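One way to picture detection and re-execution is the optimistic sketch below. It is not Reddio's algorithm, whose exact rules the text does not specify: here conflicts are checked at merge time rather than during execution, a transaction conflicts if it read a key that an earlier transaction in the block wrote, and the names `execute` and `process_block` are hypothetical. Transfers are assumed to have distinct sender and recipient.

```python
def execute(tx, state):
    """Run a transfer against `state`; return its (read_set, write_set)."""
    sender, recipient, amount = tx
    reads = {sender: state.get(sender, 0), recipient: state.get(recipient, 0)}
    writes = {}
    if reads[sender] >= amount:
        writes = {sender: reads[sender] - amount,
                  recipient: reads[recipient] + amount}
    return reads, writes


def process_block(global_state, txs):
    """Execute optimistically, then merge in order and re-execute on conflict."""
    snapshot = dict(global_state)
    # In a real system these executions would run in parallel threads.
    results = [execute(tx, snapshot) for tx in txs]

    dirty = set()    # keys written by transactions merged so far
    reexecuted = []  # indexes of transactions that had to run again
    for i, (tx, (reads, writes)) in enumerate(zip(txs, results)):
        if dirty.intersection(reads):  # read something an earlier tx changed
            reads, writes = execute(tx, global_state)
            reexecuted.append(i)
        global_state.update(writes)
        dirty.update(writes)
    return reexecuted


state = {"alice": 10, "bob": 0, "carol": 5, "dave": 8, "erin": 0}
txs = [("alice", "bob", 4), ("bob", "carol", 3), ("dave", "erin", 2)]
reexecuted = process_block(state, txs)

print(reexecuted)  # [1]
print(state)       # {'alice': 6, 'bob': 1, 'carol': 8, 'dave': 6, 'erin': 2}
```

Only the second transfer is re-run, because it read bob's balance after the first transfer changed it; the third touches unrelated accounts and merges directly. The final state matches what serial execution would produce.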
After all transactions are executed, the change records in multiple pending-stateDBs will be merged into the global stateDB. If the merge is successful, EVM will submit the final state to the global state tree and generate a new state root.
The performance improvement of multi-threaded parallel optimization is obvious, especially when dealing with complex smart contract transactions.
According to research on parallel EVM, in low-conflict workloads (where few transactions in the transaction pool conflict with each other or compete for the same resources), benchmark TPS is about 3 to 5 times higher than with traditional serial execution. In high-conflict workloads, theoretically, if all optimization methods are used, it can even reach 60 times.
Summary
Reddio's multi-threaded parallel optimization significantly improves the EVM's transaction processing capacity by executing transactions in parallel on different threads and giving each thread a temporary state database. By optimizing read and write operations and introducing a conflict detection mechanism, an EVM-based public chain can parallelize transactions at scale while preserving state consistency, addressing the performance bottleneck of the traditional serial execution mode. This lays an important foundation for the future development of Ethereum Rollups.
In future articles we will further analyze Reddio's implementation details, such as how to improve efficiency by optimizing storage, how to handle workloads where conflicts occur frequently, and how to use GPUs for optimization.