Author: Luo Benben, former Arbitrum technical ambassador, geek web3 contributor
This article, written by Luo Benben, former Arbitrum technical ambassador and former co-founder of the smart contract automated audit company Goplus Security, is a technical interpretation of Arbitrum One.
Chinese-language articles and materials on Layer 2 lack professional interpretation of Arbitrum, and of OP Rollup more generally, so this article attempts to fill that gap by explaining how Arbitrum operates. Because Arbitrum's structure is so complex, the full article exceeds 10,000 words even after as much simplification as possible, so it is divided into two parts. We recommend bookmarking and sharing it as a reference!
A simplified description of the Rollup sequencer
The principle of Rollup scaling can be summarized in two points:
Cost optimization: Transfer most of the computing and storage tasks off the L1 chain, that is, onto L2. L2 is in most cases a chain running on a single server, namely the sequencer/operator.
The sequencer looks and feels close to a centralized server, giving up "decentralization" in the blockchain trilemma in exchange for TPS and cost advantages. Users can have L2 process their transaction instructions instead of Ethereum, at a much lower cost than transacting on Ethereum.
(Source: BNB Chain)
Security guarantee: The transaction content and post-transaction state on L2 are synchronized to Ethereum L1, where contracts verify the validity of the state transitions. At the same time, the historical records of L2 are retained on Ethereum. Even if the sequencer goes down permanently, others can restore the entire L2 state from the records on Ethereum.
Fundamentally, Rollup's security is based on Ethereum. If the sequencer does not know an account's private key, it cannot initiate transactions in that account's name, nor can it tamper with the account's asset balance (even if it tried, it would be quickly discovered).
Although the sequencer, as the system hub, is centralized, in relatively mature Rollup solutions a centralized sequencer can only engage in soft misbehavior such as transaction censorship. Even so, an ideal Rollup solution has corresponding means to contain this (anti-censorship mechanisms such as forced withdrawals or ordering proofs).
(The forced withdrawal function set by the Loopring protocol in the contract source code on L1 for users to call)
The state verification methods used to prevent a Rollup sequencer from acting maliciously fall into two categories: Fraud Proof and Validity Proof. A Rollup scheme that uses fraud proofs is called an OP Rollup (Optimistic Rollup, OPR). Because of some historical baggage, a Rollup scheme that uses validity proofs is usually called a ZK Rollup (Zero-knowledge Proof Rollup, ZKR) rather than a Validity Rollup.
Arbitrum One is a typical OPR. The contracts it deploys on L1 do not actively verify the submitted data and optimistically assume that the data is correct. If the submitted data is incorrect, an L2 validator node will actively initiate a challenge.
Therefore, OPR also implies a trust assumption: at any time there is at least one honest L2 validator node. A ZKR's contract, by contrast, uses cryptographic computation to verify the data submitted by the sequencer proactively yet at low cost.
(Optimistic Rollup operation mode)
(ZK Rollup operation mode)
This article gives an in-depth introduction to Arbitrum One, the leading Optimistic Rollup project, covering all aspects of the system. After reading it carefully, you will have a deep understanding of Arbitrum and Optimistic Rollup (OPR).
Arbitrum’s core components and workflow
Core contracts:
Arbitrum's most important contracts include SequencerInbox, DelayedInbox, L1 Gateways, L2 Gateways, Outbox, RollupCore, Bridge, etc. More details will be introduced later.
Sequencer:
Receives and orders user transactions, computes the transaction results, and quickly (usually in <1s) returns a receipt to the user. Users can often see their transactions included on L2 within a few seconds, an experience much like a Web2 platform.
At the same time, the sequencer immediately broadcasts the latest L2 block off-chain (outside Ethereum), and any Layer 2 node can receive it asynchronously. At this point, however, these L2 blocks are not final and can be rolled back by the sequencer.
Every few minutes, the sequencer compresses the ordered L2 transaction data, aggregates it into a batch, and submits it to the SequencerInbox contract on Layer 1 to ensure data availability and the operation of the Rollup protocol. Generally speaking, L2 data submitted to Layer 1 cannot be rolled back and can be considered final.
From the above process we can summarize: Layer 2 has its own node network, but these nodes are few in number and generally lack the consensus protocols commonly used in public chains, so its security on its own is very poor. It must rely on Ethereum to guarantee the reliability of data publication and the validity of state transitions.
Arbitrum Rollup protocol:
A series of contracts that define the structure of the Rollup chain's block (the RBlock), how the chain is extended, how RBlocks are published, the challenge process, and so on. Note that the Rollup chain mentioned here is not the Layer 2 ledger as commonly understood, but an abstract "chain-like data structure" that Arbitrum One set up independently in order to implement the fraud proof mechanism.
One RBlock can contain the results of multiple L2 blocks, and its data is very different from that of an L2 block. The RBlock data entities are stored in the RollupCore series of contracts. If there is a problem with an RBlock, a Validator will challenge the submitter of that RBlock.
Validator:
Arbitrum's validator nodes are actually a special subset of Layer 2 full nodes, and admission is currently whitelisted.
A Validator creates a new RBlock (Rollup block, also called an assertion) based on the transaction batches submitted by the sequencer to the SequencerInbox contract, monitors the state of the current Rollup chain, and challenges incorrect data submitted by the sequencer.
An active Validator needs to stake assets on Ethereum in advance, so we sometimes also call it a Staker. Although Layer 2 nodes that do not stake can also monitor the operation of the Rollup and send abnormality alerts to users, they cannot intervene directly on Ethereum against incorrect data submitted by the sequencer.
Challenge:
The basic steps can be summarized as multiple rounds of interactive subdivision followed by a single-step proof. In the subdivision process, the two parties to the challenge first subdivide the disputed transaction data over multiple rounds until they isolate the disputed opcode instruction, which is then verified. The "multi-round subdivision, single-step proof" paradigm is considered by Arbitrum developers to be the most gas-efficient implementation of fraud proofs. Every step is controlled by contracts, and neither party can cheat.
Challenge period:
Due to the optimistic nature of OP Rollup, after each RBlock is submitted on-chain the contract does not actively check it, but instead reserves a window of time for validators to falsify it. This time window is the challenge period, which is 1 week on Arbitrum One mainnet. After the challenge period ends, the RBlock is finally confirmed, and the L2-to-L1 messages within the block (such as withdrawals performed through the official bridge) can be released.
ArbOS, Geth, WAVM:
The virtual machine used by Arbitrum is called AVM and consists of two parts, Geth and ArbOS. Geth is the most commonly used Ethereum client software, and Arbitrum has made lightweight modifications to it. ArbOS is responsible for all L2-specific functions, such as network resource management, generating L2 blocks, and working together with the EVM. We regard the combination of the two as the Native AVM, which is the virtual machine used by Arbitrum. WAVM is the result of compiling the AVM code into Wasm. In the Arbitrum challenge process, the final "single-step proof" verifies a WAVM instruction.
Here, we can use the following figure to represent the relationship and workflow between the above components:
L2 transaction life cycle
The processing flow of an L2 transaction is as follows:
1. The user sends a transaction instruction to the sequencer.
2. The sequencer first verifies the digital signature and other data of each pending transaction, discards invalid transactions, and then orders and executes the rest.
3. The sequencer sends the transaction receipt to the user (usually very quickly), but this is only "preprocessing" performed by the sequencer off the Ethereum chain; it is in a state of Soft Finality and is not reliable. Users who trust the sequencer (most users), however, can optimistically assume that the transaction is complete and will not be rolled back.
4. The sequencer heavily compresses the preprocessed raw transaction data and packages it into a Batch.
5. At intervals (affected by factors such as data volume and Ethereum congestion), the sequencer publishes transaction batches to the SequencerInbox contract on L1. At this point, the transaction can be considered to have Hard Finality.
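The five steps above can be sketched as a toy model. This is only an illustration under stated assumptions: the class name ToySequencer, the signature_ok flag, the newline-delimited batch format, and the use of zlib as a stand-in for the real compression are all hypothetical and are not taken from Arbitrum's code.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class ToySequencer:
    """Toy model (hypothetical): orders transactions, returns receipts, posts batches."""
    pending: list = field(default_factory=list)   # ordered transactions with soft finality
    l1_inbox: list = field(default_factory=list)  # stand-in for the SequencerInbox contract

    def submit(self, tx: bytes, signature_ok: bool) -> dict:
        # Step 2: discard invalid transactions, keep the rest in arrival order.
        if not signature_ok:
            return {"accepted": False, "finality": None}
        self.pending.append(tx)
        # Step 3: an immediate receipt, but only with soft finality.
        return {"accepted": True, "finality": "soft"}

    def post_batch(self) -> int:
        # Steps 4-5: compress the ordered transactions and publish them to "L1".
        batch = zlib.compress(b"\n".join(self.pending))
        self.l1_inbox.append(batch)
        self.pending.clear()
        return len(self.l1_inbox) - 1  # index of the batch that now has hard finality


def recover_batch(batch: bytes) -> list:
    """Anyone can decompress a published batch to recover the ordered inputs."""
    return zlib.decompress(batch).split(b"\n")
```

The point of the sketch is the two finality levels: a receipt from submit() depends on trusting the sequencer, while anything returned by recover_batch() can be read back by anyone from the published data.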
Sequencer Inbox Contract
This contract receives the transaction batches submitted by the sequencer to ensure data availability. Looking more closely, the batch data in SequencerInbox completely records Layer 2's transaction inputs. Even if the sequencer goes down permanently, anyone can restore the current Layer 2 state from the batch records and take over from the failed or runaway sequencer.
To use a physical analogy, the L2 we see is just the projection of the batches in SequencerInbox, and the light source is the STF (state transition function). Because the light source STF does not change easily, the shape of the shadow is determined solely by the batches acting as the object.
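The projection analogy can be made concrete with a minimal sketch: the state is nothing more than a fixed STF folded over the recorded inputs. The transfer-only STF and the names stf and replay below are hypothetical simplifications, not Arbitrum's actual state transition function.

```python
from functools import reduce


def stf(state: dict, tx: tuple) -> dict:
    """Toy state transition function: apply one transfer (sender, receiver, amount)."""
    sender, receiver, amount = tx
    if state.get(sender, 0) < amount:
        return state  # an invalid transfer leaves the state unchanged
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[receiver] = new_state.get(receiver, 0) + amount
    return new_state


def replay(genesis: dict, batches: list) -> dict:
    """Rebuild the L2 state by folding the STF over every tx of every batch, in order."""
    txs = [tx for batch in batches for tx in batch]
    return reduce(stf, txs, genesis)
```

Because stf is deterministic, any two parties that replay the same batches from the same genesis state must arrive at the same result, which is why the batch records alone are enough to restore the L2 state.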
The SequencerInbox contract is called the fast inbox. The sequencer submits preprocessed transactions to it, and only the sequencer can submit data to it. The counterpart of the fast inbox is the slow inbox, Delayed Inbox, whose function will be described later.
Validators constantly monitor the SequencerInbox contract. Whenever the sequencer publishes a Batch to the contract, an on-chain event is emitted. When a Validator detects this event, it downloads the batch data, executes it locally, and then publishes an RBlock to the Rollup protocol contracts on Ethereum.
Arbitrum's bridge contract has a parameter called the accumulator, which records newly submitted L2 batches as well as the number and information of transactions newly received by the slow inbox.
(The sequencer continuously submits batches to SequencerInbox)
(Batch specific information, the data field corresponds to Batch data, the size of this part of the data is very large, and the screenshot is not fully displayed)
The SequencerInbox contract has two main functions:
addSequencerL2BatchFromOrigin(): the sequencer calls this function each time it submits Batch data to the SequencerInbox contract.
forceInclusion(): this function can be called by anyone and is used to implement censorship-resistant transactions. How it takes effect will be explained in detail later, when we discuss the Delayed Inbox contract.
Both functions call bridge.enqueueSequencerMessage() to update the accumulator parameter in the bridge contract.
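One common way to build such an accumulator is a running hash chain, in which each new batch is folded into the previous value. The sketch below illustrates that idea only; the field layout, the use of SHA-256 (Ethereum contracts would typically use Keccak-256), and the function names are assumptions rather than the bridge contract's real implementation.

```python
import hashlib


def next_accumulator(prev_acc: bytes, batch_data: bytes, delayed_count: int) -> bytes:
    """Fold one new batch into the running accumulator (hypothetical layout)."""
    data_hash = hashlib.sha256(batch_data).digest()
    # The new value commits to everything before it, the new batch, and the
    # number of slow-inbox messages read so far.
    return hashlib.sha256(prev_acc + data_hash + delayed_count.to_bytes(32, "big")).digest()


def accumulate(batches: list) -> bytes:
    """Compute the accumulator over a list of (batch_data, delayed_count) pairs."""
    acc = bytes(32)  # all-zero starting value, an assumption of this sketch
    for data, delayed_count in batches:
        acc = next_accumulator(acc, data, delayed_count)
    return acc
```

With this construction, a single 32-byte value commits to the full ordered history: changing, dropping, or reordering any batch produces a different accumulator.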
Gas Pricing
Obviously, L2 transactions cannot be free, because that would invite DoS attacks. In addition, there are the sequencer's own L2 operating costs, as well as the overhead of submitting data on L1. When a user initiates a transaction on the Layer 2 network, the gas fee is structured as follows:
The data publishing cost incurred by occupying Layer 1 resources: this comes mainly from the batches submitted by the sequencer (each batch contains many user transactions), and the cost is ultimately shared among the transaction initiators. The pricing algorithm for data publishing fees is dynamic: the sequencer sets the price based on its recent profit and loss, the batch size, and the current Ethereum gas price.
The cost incurred by occupying Layer 2 resources: a per-second gas limit is set that keeps the system running stably (currently 7 million on Arbitrum One). The reference gas prices for L1 and L2 are tracked and adjusted by ArbOS; the formulas will not be described here for the time being.
Although the specific gas price calculation is relatively complicated, users do not need to be aware of these details, and they can clearly feel that Rollup transaction fees are much cheaper than on Ethereum mainnet.
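The two-part fee structure can be illustrated with a deliberately simplified sketch. It does not reproduce ArbOS's dynamic pricing, which this article does not describe; splitting the batch cost in proportion to transaction size is an assumption made here for illustration, and all names and numbers are hypothetical.

```python
def share_batch_cost(batch_l1_cost: int, tx_sizes: list) -> list:
    """Split one batch's L1 posting cost across its transactions by data size (assumed rule)."""
    total_size = sum(tx_sizes)
    return [batch_l1_cost * size // total_size for size in tx_sizes]


def l2_transaction_fee(l2_gas_used: int, l2_gas_price: int, l1_data_share: int) -> dict:
    """Total fee = share of L1 data publishing cost + cost of L2 resources used."""
    l2_part = l2_gas_used * l2_gas_price
    return {"l1_data": l1_data_share, "l2_execution": l2_part, "total": l1_data_share + l2_part}
```

For example, share_batch_cost(1000, [100, 300]) returns [250, 750], and the first transaction's total fee is then its 250 share plus whatever its L2 execution costs.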
Optimistic fraud proof
Recalling the above, L2 is really just the projection of the transaction input batches that the sequencer submits to the fast inbox, that is:
Transaction Inputs -> STF -> State Outputs. The input is determined and the STF is fixed, so the output is determined as well. The fraud proof system and the Arbitrum Rollup protocol together form a system that publishes the output state root to L1 in the form of RBlocks (aka assertions) and proves it optimistically.
On L1 there are both the input data published by the sequencer and the output state published by validators. Let's think this through: is it really necessary to publish the Layer 2 state on-chain?
Since the input already fully determines the output, and the input data is publicly visible, submitting the output (the state) seems redundant. But this view ignores the practical need for state settlement between the two systems, L1 and L2: withdrawals from L2 to L1 require proof of the state.
When building a Rollup, one of the core ideas is to move most computation and storage to L2 to avoid the high cost of L1. This means that L1 does not know the state of L2. It only helps the L2 sequencer publish the input data of all transactions and is not responsible for computing the L2 state.
A withdrawal essentially unlocks the corresponding funds from an L1 contract, based on the cross-chain message issued by L2, and transfers them to the user's L1 account or performs some other action.
At this point, the Layer 1 contract will ask: what is your state on Layer 2, and how do you prove that you really own the assets you claim to be bridging? The user must then provide the corresponding Merkle Proof, etc.
So, if we build a Rollup without a withdrawal function, it is theoretically possible not to synchronize the state to L1, and no state proof system such as fraud proofs is needed (although this may cause other problems). In real applications, however, this is obviously not feasible.
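For readers unfamiliar with Merkle proofs, here is a minimal generic sketch of how a contract that knows only a root hash can check that a leaf (for example, a withdrawal message) belongs to the committed set. It uses SHA-256 and a simple duplicate-last-node rule for odd levels; both are assumptions of this sketch and not Arbitrum's actual Outbox format.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list) -> bytes:
    """Root of a binary Merkle tree over a non-empty list of leaves."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: list, index: int) -> list:
    """Sibling hashes from the leaf up to the root, each flagged if it sits on the left."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf: bytes, proof: list, root: bytes) -> bool:
    """What the verifying side does: recompute the root from the leaf and the proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The verifier needs only the root and a proof whose length grows logarithmically with the number of leaves, which is why L1 can check a claim about L2 without storing the L2 state itself.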
In the so-called optimistic proof, the contract does not check whether the output status submitted to L1 is correct, and optimistically believes that everything is accurate. The optimistic proof system will assume that there is at least one honest Validator at any time. If an incorrect state occurs, it will be challenged through a fraud proof.
The advantage of this design is that there is no need to actively verify every RBlock published to L1, which avoids wasting gas. In fact, for OPR it is unrealistic to verify every assertion, because each RBlock contains one or more L2 blocks and every transaction would have to be re-executed on L1. That would be no different from executing the L2 transactions directly on L1, which defeats the purpose of Layer 2 scaling.
ZKR does not have this problem, because a ZK proof is succinct: the contract only needs to verify a very small proof and does not need to actually execute the many transactions behind it. ZKR therefore does not operate optimistically. Every time a state is published, a Verifier contract verifies it mathematically.
Although a fraud proof cannot be as succinct as a zero-knowledge proof, Arbitrum uses an interactive, turn-based "multi-round subdivision, single-step proof" process, so that what ultimately needs to be proven is only a single virtual machine opcode, and the cost is relatively small.
Rollup Protocol
Let's first look at the entry point for initiating challenges and starting proofs, that is, how the Rollup protocol works.
The core contract of the Rollup protocol is RollupProxy.sol. While keeping the data structure consistent, it uses an unusual dual-proxy structure in which one proxy corresponds to two implementations, RollupUserLogic.sol and RollupAdminLogic.sol. Tools such as Scan cannot currently parse this structure well.
In addition, there is the ChallengeManager.sol contract responsible for managing challenges, and the OneStepProver series of contracts to determine fraud proofs.
(Source: L2BEAT official website)
RollupProxy records a series of RBlocks (aka assertions) submitted by different Validators, shown as the boxes in the picture below: green is confirmed, blue is unconfirmed, yellow is falsified.
An RBlock contains the final state after executing one or more L2 blocks since the previous RBlock. These RBlocks form a formal Rollup Chain (note that this is different from the L2 ledger itself). Under optimistic circumstances, this Rollup Chain should have no forks, because a fork means that Validators have submitted conflicting Rollup Blocks.
To propose or endorse an assertion, a validator needs to stake a certain amount of ETH on it and become a Staker. That way, when a challenge/fraud proof occurs, the loser's stake is forfeited. This is the economic basis that ensures validators behave honestly.
The blue block No. 111 in the lower right corner of the picture will eventually be falsified because its parent block No. 104 is wrong (yellow).
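The rule that a block built on a falsified parent is itself doomed can be sketched as a walk up the parent links. The parent map used in the usage note is hypothetical apart from the one relationship stated in the text (block 111 has parent 104), and the function name is also an assumption.

```python
def is_rejected(block: int, parent_of: dict, falsified: set) -> bool:
    """An RBlock is rejected if it, or any of its ancestors, has been falsified."""
    current = block
    while current is not None:
        if current in falsified:
            return True
        current = parent_of.get(current)  # None once we walk past the oldest known block
    return False
```

With parent_of = {111: 104} and falsified = {104}, is_rejected(111, parent_of, falsified) is True even though nobody challenged block 111 directly.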
In addition, verifier A proposed Rollup Block No. 106, but B disagreed and challenged it.
After B initiates the challenge, the ChallengeManager contract is responsible for verifying the subdivision process of the challenge steps:
1. Subdivision is a process in which the two parties take turns. One party segments the historical data contained in the Rollup Block, and the other party points out which data segment is problematic. It resembles bisection (actually N/K) and progressively narrows the range.
2. After that, the parties can go on to locate the problematic transaction and its result, and then subdivide further down to a single disputed machine instruction within that transaction.
3. The ChallengeManager contract only checks whether the "data segments" produced by subdividing the original data are valid.
4. Once the challenger and the challenged party have located the machine instruction in dispute, the challenger calls oneStepProveExecution() to submit a single-step fraud proof, proving that the execution result of this machine instruction is wrong.
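The narrowing process in the steps above can be sketched as a K-way search over two execution traces. This is a simplified model under stated assumptions: each trace is a plain list of per-step state values, both parties agree on the start state and disagree on the end state, and the function name and parameters are hypothetical. The real protocol exchanges hashes through the ChallengeManager contract rather than comparing whole traces in one place.

```python
def find_disputed_step(claimed: list, honest: list, k: int = 2) -> int:
    """Return i such that the traces agree at step i but disagree at step i + 1."""
    assert len(claimed) == len(honest) and k >= 2
    assert claimed[0] == honest[0] and claimed[-1] != honest[-1]
    lo, hi = 0, len(claimed) - 1  # invariant: agree at lo, disagree at hi
    while hi - lo > 1:
        # One party splits [lo, hi] into roughly k segments by posting intermediate states.
        step = max(1, (hi - lo) // k)
        cuts = list(range(lo, hi, step)) + [hi]
        # The other party picks the first segment whose end state it disputes.
        for start, end in zip(cuts, cuts[1:]):
            if claimed[end] != honest[end]:
                lo, hi = start, end
                break
    return lo
```

Each round shrinks the disputed range by roughly a factor of K, so even a very long execution is reduced to a single step in a logarithmic number of rounds. That single step is then handed to the single-step proof.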
Single-step proof
Single-step proof is the core of the entire Arbitrum fraud proof. Let’s take a look at what the single-step proof specifically proves.
This requires first understanding WAVM, the Wasm Arbitrum Virtual Machine, which is a virtual machine compiled from the ArbOS module and the Geth (Ethereum client) core module. Because L2 differs greatly from L1 in many respects, the original Geth core must be lightly modified and work together with ArbOS.
So, the state transition on L2 is actually the joint work of ArbOS+Geth Core.
Arbitrum's node clients (sequencer, validator, full node, etc.) compile the above ArbOS + Geth Core program into native machine code that the node host can execute directly (for x86/ARM/PC/Mac/etc.).
If the compilation target is changed to Wasm, the result is the WAVM used by validators when generating fraud proofs. The contract that verifies single-step proofs also emulates the functions of the WAVM virtual machine.
Why, then, does the program need to be compiled into Wasm bytecode when generating a fraud proof? The main reason is that the contract verifying a single-step fraud proof must use an Ethereum smart contract to emulate a virtual machine that can process a certain instruction set, and Wasm is easy to emulate in a contract.
Compared with native machine code, however, Wasm runs slightly slower, so Arbitrum's nodes and contracts use WAVM only when generating and verifying fraud proofs.
After the preceding rounds of interactive subdivision, the single-step proof finally proves a single instruction in the WAVM instruction set.
As you can see from the code below, OneStepProofEntry first determines which category the opcode of the instruction to be proven belongs to, then calls the corresponding prover (such as Mem or Math) and passes the single-step instruction into that prover contract.
The final result, afterHash, is returned to ChallengeManager. If the hash after executing the instruction is inconsistent with the hash recorded on the Rollup Block, the challenge succeeds. If they are consistent, the execution result of this instruction recorded on the Rollup Block is correct, and the challenge fails.
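The dispatch-then-compare logic just described can be illustrated with a toy stack machine. This is not the OneStepProofEntry contract code: the four opcodes, the two prover functions, the state hashing scheme, and every name below are hypothetical stand-ins chosen only to show the shape of the check.

```python
import hashlib


def state_hash(stack: tuple, pc: int) -> bytes:
    """Hash of a toy machine state (operand stack plus program counter)."""
    return hashlib.sha256(repr((stack, pc)).encode()).digest()


def exec_math(op: str, arg, stack: tuple) -> tuple:
    a, b = stack[-2], stack[-1]
    return stack[:-2] + ((a + b) if op == "ADD" else (a * b),)


def exec_stack(op: str, arg, stack: tuple) -> tuple:
    return stack + (arg,) if op == "PUSH" else stack[:-1]


# Dispatch table: each opcode category is handled by its own "prover".
PROVERS = {"ADD": exec_math, "MUL": exec_math, "PUSH": exec_stack, "POP": exec_stack}


def one_step_after_hash(instruction: tuple, stack: tuple, pc: int) -> bytes:
    """Execute exactly one instruction and return the hash of the resulting state."""
    op, arg = instruction
    new_stack = PROVERS[op](op, arg, stack)
    return state_hash(new_stack, pc + 1)


def challenge_succeeds(instruction: tuple, stack: tuple, pc: int, claimed_after_hash: bytes) -> bool:
    """The challenge succeeds only if the recomputed hash differs from the claimed one."""
    return one_step_after_hash(instruction, stack, pc) != claimed_after_hash
```

Because only one instruction is executed, the cost of the check is small and fixed regardless of how long the disputed execution was.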
In the next article, we will analyze the contract modules that handle cross-chain messaging/bridging between Layer 2 and Layer 1 in Arbitrum, and further clarify how a true Layer 2 should achieve censorship resistance.