Authors: Wei Han Ng, Carlos Pérez, Stateless Consensus Research Team; Translation: @jinsecaijingxiaozou
Ethereum has grown from a small, experimental network into a key component of global infrastructure. It settles billions of dollars in value daily, coordinates thousands of applications, and underpins the entire Layer 2 (L2) network ecosystem.
All of this ultimately relies on a core underlying component: state.
1. What "State" Is and Why It Matters
A user's balance is not stored in their wallet, but rather resides in Ethereum's state. State can be roughly understood as "everything Ethereum currently knows": accounts, contract storage (all data written to contracts), and bytecode (the logic that runs when using smart contracts).
State is fundamental to almost all functionality: wallets rely on it to display balances and transaction history; decentralized applications (DApps) query it to learn current holdings, orders, or messages; and infrastructure (block explorers, cross-chain bridges, indexers, and so on) continuously reads state to build services on top of it. If state becomes too large, too centralized, or too hard to serve, all of these layers become more fragile, more costly, and harder to decentralize.
2. L1 Scaling Brings Consequences
Ethereum has been working on scaling for many years: through Layer 2 networks, EIP-4844, gas limit increases, gas repricing, and enshrined proposer-builder separation (ePBS). Each step has increased the network's capacity, but each has also brought new challenges.
Challenge 1: Continuously Expanding State
Ethereum's state size only increases. Every new account, storage operation, and bytecode write permanently increases the amount of data the network must store.
This incurs specific costs:
Validators and full nodes must store more data. As the state size increases, the database needs to handle additional workload, and efficiency decreases accordingly.
RPC service providers must keep the complete state accessible so that any account or storage slot can be queried at any time.
State growth makes node synchronization slower and less stable.
Raising the gas limit accelerates state bloat, because each block can include more writes.
This has already happened on other public chains. As state grows, it becomes difficult for ordinary users to run full nodes, and state ends up concentrated in the hands of a few large service providers.
On Ethereum, most blocks are already produced by professional builders. A core concern is how many independent entities can still build blocks end to end at critical moments. If only a very small number of participants can store and serve the complete state, censorship resistance and credible neutrality are compromised, because fewer entities can build blocks that include censored transactions.
On the positive side, mechanisms such as FOCIL and VOPS are designed to preserve censorship resistance within a professional builder ecosystem. Their effectiveness, however, still depends on a healthy node ecosystem in which nodes can access, store, and serve state at an affordable cost. Controlling state growth is therefore a necessary prerequisite, not an optional optimization.
To determine the tipping points, we are actively running stress tests:
When does state growth become a scalability bottleneck?
When does state size make it difficult for nodes to follow the chain head?
When do client implementations fail under extreme state sizes?
Challenge 2: In a Stateless Architecture, Who Stores and Serves the State?
Even if Ethereum kept its current gas limit forever, it would eventually run into state bloat. Meanwhile, the community clearly expects higher throughput.
Statelessness removes a major constraint: validators can verify blocks using proofs alone, without holding the complete state. This is a significant scalability breakthrough that meets the community's demand for higher throughput. It also exposes a previously implicit fact: state storage can evolve into an independent, more specialized function rather than being tied to every validator.
At that point, most state may be stored only by:
Block builders
RPC service providers
Other specialized operators (such as MEV searchers and block explorers)
In other words, state will become more centralized. This has several consequences:
Harder synchronization: centralized service providers may begin to restrict access to state, making it difficult for new providers to launch.
Weaker censorship resistance: if the censored state cannot be obtained, censorship resistance mechanisms such as FOCIL may fail.
Resilience risk: if only a few entities store and serve the complete state, outages or external pressure can quickly cut off access for most of the ecosystem.
Even if many entities store state, there is no effective way to verify that they are actually serving it, and existing incentives are insufficient. Snap sync is widely supported by default, but the same is not true of RPC services.
Unless serving state becomes cheaper and more broadly attractive, the network's ability to access its own state will be limited to a few service providers.
This problem also affects Layer 2 networks. Users' ability to force transaction inclusion depends on reliable access to the rollup contract's state on L1. If L1 state access becomes fragile or highly centralized, these safety valves will be difficult to use in practice.
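The stateless idea described in this section, checking one piece of state against a commitment using only a proof, can be illustrated with a minimal sketch. It assumes a plain binary SHA-256 Merkle tree, which is a simplification and not Ethereum's actual state trie. The function names (`build_levels`, `make_proof`, `verify`) and the `name:balance` leaf encoding are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from hashed leaves up to the single root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: odd levels are padded by duplicating the last node.
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(root: bytes, leaf: bytes, proof: list[tuple[bytes, bool]]) -> bool:
    """Recompute the root from the leaf and its proof; no full state needed."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


# A state holder builds the tree; a verifier only needs `root`, a leaf, and its proof.
leaves = [b"alice:10", b"bob:25", b"carol:7"]
levels = build_levels(leaves)
root = levels[-1][0]
```

The verifier's work grows with the depth of the tree rather than with the size of the state, which is why proof-based verification decouples validation from storage. Someone still has to hold the leaves in order to produce proofs, which is the question this challenge raises.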
3. Three Major Directions We See
(1) State Expiry
Not all state is equally important forever. Our recent analysis shows that about 80% of state has not been accessed for more than a year, yet nodes still bear the cost of storing it permanently.
The core idea of state expiry is to temporarily remove inactive state from the "active set" and restore it through some form of proof when needed.
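The idea above (drop inactive entries from the active set, restore them later with a proof) can be sketched as follows. This is a toy model under stated assumptions, not a protocol design: the class `ExpiringState`, the block-based `ttl`, and the per-entry SHA-256 commitment used as the "proof" are all hypothetical. A real design would prove against a state root instead of keeping a commitment per entry.

```python
import hashlib


def commit(key: str, value: int) -> bytes:
    """Hypothetical commitment to a single (key, value) entry."""
    return hashlib.sha256(f"{key}={value}".encode()).digest()


class ExpiringState:
    def __init__(self, ttl: int):
        self.ttl = ttl        # blocks of inactivity before an entry expires
        self.active = {}      # key -> (value, last_access_block)
        self.expired = {}     # key -> commitment; the value itself is dropped

    def write(self, key: str, value: int, block: int) -> None:
        if key in self.expired:
            raise KeyError(f"{key} is expired; revive it first")
        self.active[key] = (value, block)

    def read(self, key: str, block: int) -> int:
        value, _ = self.active[key]        # KeyError if expired or unknown
        self.active[key] = (value, block)  # any access refreshes the entry
        return value

    def expire_inactive(self, block: int) -> list[str]:
        """Move entries untouched for `ttl` blocks out of the active set."""
        stale = [k for k, (_, last) in self.active.items()
                 if block - last >= self.ttl]
        for k in stale:
            value, _ = self.active.pop(k)
            self.expired[k] = commit(k, value)
        return stale

    def revive(self, key: str, claimed_value: int, block: int) -> bool:
        """Restore an expired entry if the claimed value matches the commitment."""
        if self.expired.get(key) != commit(key, claimed_value):
            return False
        del self.expired[key]
        self.active[key] = (claimed_value, block)
        return True
```

After expiry, the node no longer holds the value; whoever wants to use the entry again must supply it, and the node only checks it against what it committed to.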
Broadly, the designs fall into two main categories.
Category 1: Mark, expire, revive. Instead of treating all state as permanently active, the protocol marks rarely used state as inactive and removes it from the active set maintained by each node, while allowing it to be restored later through a proof of its historical existence. In practice, frequently used contracts and balances remain active with low access costs, while long-forgotten state no longer has to be maintained by every node and can still be brought back when needed.
Category 2: Multi-period expiry. In a multi-period design, expiry is not set for individual entries. Instead, state is periodically divided into periods (e.g., one period = one year). The current period is small and fully active, older periods are frozen from the perspective of real-time execution, and new state is written into the current period. Old state can only be restored by proving its existence in a previous period.
The mark-expire-revive mechanism is usually more fine-grained and its revival process more direct, but marking requires storing additional metadata. Multi-period expiry is conceptually simpler and integrates more naturally with archiving, but its revival proofs are often more complex and larger.
Ultimately, both approaches share the same goal: keeping the active state lean by temporarily removing inactive parts while providing a path to revive them. They make different trade-offs in complexity, user experience, and how work is divided between clients and infrastructure.
(2) State Archiving
State archiving separates state into hot and cold. Hot state is the part the network needs to access frequently; cold state is the part that still matters for historical records and verifiability but is rarely touched.
In the state archiving design, nodes explicitly store frequently used hot state and historical data separately. Even as the overall state continues to grow, the portion that needs fast access (the hot data set) can remain bounded in size. In effect, node performance, especially the I/O cost of accessing state, can remain roughly stable over time instead of degrading with chain age.
(3) Lowering the Barrier to Entry for State Storage and Services
An obvious question is: can we achieve our goal while holding less data? In other words, can we design nodes and wallets that do not need to store the complete state permanently and can still be effective participants?
One promising direction is partial statelessness:
Nodes store and serve only part of the state (such as data related to a specific user or application).
Wallets and light clients take a more active role in storing and caching the fragments of state they need, rather than relying entirely on a few large RPC service providers.
If storage can be securely distributed across wallets and niche nodes, the burden on individual operators will fall, and the set of state holders will become more diverse.
Another direction is lowering the barrier to entry for running useful infrastructure:
Simplifying the deployment of RPC nodes that serve only partial state.
Designing protocols and tools that let wallets and applications discover and combine multiple partial data sources, rather than relying on a single complete RPC endpoint.
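The last point, discovering and combining several partial data sources, can be sketched as a small client-side router. Everything here is a hypothetical illustration: the `PartialNode` and `StateRouter` names, the idea of nodes advertising address prefixes, and the in-memory `storage` dictionary are assumptions, not an existing protocol or API. The sketch also trusts each response as given; a real client would verify responses against a state root.

```python
from dataclasses import dataclass, field


@dataclass
class PartialNode:
    name: str
    prefixes: tuple[str, ...]  # address prefixes this node claims to serve
    storage: dict[str, int] = field(default_factory=dict)

    def serves(self, address: str) -> bool:
        # str.startswith accepts a tuple and matches any of its prefixes.
        return address.startswith(self.prefixes)

    def get_balance(self, address: str):
        return self.storage.get(address)  # None if this node lacks the entry


class StateRouter:
    """Tries each node that advertises the address until one answers."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def get_balance(self, address: str) -> tuple[int, str]:
        for node in self.nodes:
            if node.serves(address):
                value = node.get_balance(address)
                if value is not None:
                    return value, node.name
        raise LookupError(f"no partial node could serve {address}")


node_a = PartialNode("node-a", ("0x0", "0x1"), {"0x0abc": 7})
node_b = PartialNode("node-b", ("0x1", "0x2"), {"0x1def": 42})
router = StateRouter([node_a, node_b])
```

Overlapping prefix ranges give redundancy: if one node that advertises a range cannot answer, the router falls through to the next one instead of depending on a single complete endpoint.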
4. Future Directions
Ethereum's state is quietly becoming key to several core issues for the protocol's future:
To what extent will the size of the state become a barrier to participation?
When validators can securely verify blocks without state, who will store the state?
Who will provide state services to users? What will be the incentive?
Some questions remain unresolved, but the direction is clear: reduce the constraints of state on performance, reduce storage costs, and improve service accessibility.
Our current focus is on advancing low-risk, high-reward initiatives:
**Archiving Schemes**
We are exploring out-of-protocol solutions that keep the active state bounded while relying on archiving to store historical data. This will produce real-world data on performance, user experience, and operational complexity. If the approach proves effective, it can be advanced as an in-protocol upgrade if necessary.
**Partially Stateless Nodes and RPC Enhancements**
Most users and applications interact with Ethereum through centralized RPC service providers. We are working on the following improvements:
Making nodes easier and cheaper to run, even when they do not store all state
Allowing multiple nodes to collaboratively provide complete state services
Increasing the diversity of RPC service providers to avoid single points of failure
These projects were chosen because they combine immediate practicality with forward compatibility: they improve Ethereum's health today and lay the foundation for deeper protocol upgrades in the future.