Author: Chakra; Translation: 0xjs@黄金财经
Bitcoin is the world's oldest, most secure, most decentralized, and most valuable blockchain. However, its low transactions per second (TPS) and limited programmability are often criticized: they make it difficult to support large-scale applications, which seriously hinders the development of the Bitcoin ecosystem. Written by a builder in the Bitcoin ecosystem, this article takes you through the past, present, and future of Bitcoin's scaling solutions.
This is the first article in a series on Bitcoin scalability, focusing on the native scaling solutions historically implemented on the Bitcoin mainnet. The next article will discuss off-chain scaling solutions with higher scalability. Stay tuned.
Increasing the block size limit
In 2010, Satoshi Nakamoto introduced a 1MB block size limit in bitcoin-core. This explicit limit has not been modified for more than a decade.
Interestingly, Satoshi did not publicly explain why he introduced the block size limit. The limit was "hidden" in a code commit without a detailed explanation. A few years after Satoshi left, the community fell into serious disagreement over the block size limit, and the demand for larger blocks triggered widespread discussion.
The larger the block, the more transactions it can accommodate. Assuming that the consensus time remains unchanged, the larger the block, the higher the TPS.
Why is TPS so important? Because under the 1MB block limit, at the typical transaction sizes of the time, only about 3-7 transactions can be completed per second. This is far from enough for large-scale applications and cannot realize Bitcoin's vision of a "peer-to-peer electronic cash system."
However, larger blocks bring problems of their own.
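The 3-7 TPS range can be reproduced with simple arithmetic. The sketch below assumes a 10-minute block interval and average transaction sizes of 250 and 500 bytes; the sizes and the function name estimate_tps are illustrative assumptions, not figures from the original analysis.

```python
# Rough throughput estimate for a fixed block size limit.
BLOCK_SIZE_BYTES = 1_000_000    # the 1MB limit discussed above
BLOCK_INTERVAL_SECONDS = 600    # Bitcoin's target block interval of about 10 minutes


def estimate_tps(avg_tx_bytes, block_size=BLOCK_SIZE_BYTES,
                 interval=BLOCK_INTERVAL_SECONDS):
    """Approximate transactions per second for a given average transaction size."""
    txs_per_block = block_size // avg_tx_bytes
    return txs_per_block / interval


# Assumed average transaction sizes in bytes (illustrative only).
for size in (250, 500):
    print(size, round(estimate_tps(size), 2))
```

Under these assumptions, doubling the block size doubles the estimate, which is why raising the limit looked like the most direct fix.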
First, larger blocks have higher requirements for hardware such as storage, computing and bandwidth, resulting in increased operating costs for full nodes. Bitcoin's historical transaction data has expanded rapidly, requiring new full nodes to spend more time synchronizing with the network. These requirements reduce users' willingness to operate full nodes, thereby reducing the degree of decentralization.
Secondly, the larger the block, the longer it takes to propagate between nodes and the greater the chance of orphan blocks, resulting in more frequent block reorganizations, increased risk of forks, and greatly reduced security.
Vitalik later called this problem the blockchain trilemma (the "impossible triangle"): a blockchain cannot achieve decentralization, scalability and security at the same time. The larger the block, the stronger the scalability, but at the cost of weaker decentralization and security.
Most importantly, modifying the block size limit requires a hard fork, which requires all nodes in the entire network to upgrade at the same time, otherwise it will cause the network to split. This is not a good option for Bitcoin, which relies on decentralized consensus. Under the influence of Satoshi Nakamoto, avoiding hard forks seems to have become a de facto principle of Bitcoin.
Unfortunately, the split did happen. Despite the lack of consensus within the community, some miners and developers changed the block size limit in their clients, which eventually led to a network fork. In 2015, the Bitcoin XT client adopted BIP 101 to increase the block size to 8MB; in 2016, Bitcoin Classic adopted BIP 109 to raise the block size limit to 2MB. However, the vast majority of miners and users remained on what we now know as the Bitcoin mainnet.
Efforts to explicitly increase the block size through a hard fork have failed.
Segregated Witness
If a hard fork is unacceptable, can a soft fork be a solution? Segregated Witness (SegWit) is one such approach.
The witness is the data, typically signatures, that proves the right to spend a UTXO. For a long time, the witness was placed in the input script field of the spending transaction. However, because the transaction ID is computed over this field as well, this approach has potential problems such as circular dependency, third-party transaction malleability, and second-party transaction malleability.
As early as 2011, developers noticed this problem. The solution that emerged, Segregated Witness (SegWit), separates the witness from other transaction data. An early hard fork version of the idea did not gain support, and it was not until SegWit was proposed as a soft fork in 2015 that it was finally merged.
How does SegWit achieve backward compatibility through soft forks? This mainly includes the following two aspects:
The new version nodes can recognize and accept the blocks and transactions generated by the old version nodes.
Although the old version nodes cannot recognize the new rules and features introduced by the new version, they still regard the blocks generated by the new version as valid.
The SegWit soft fork allows new transactions to use empty input scripts and adds a separate witness field, outside the data that old nodes see, to store witnesses. Since Bitcoin Core before the upgrade already accepts empty input scripts, old version nodes will not reject blocks generated by the new version. In addition, a version field distinguishes the transaction types, so the old types can still be used and nodes process them differently depending on the version.
SegWit achieves scaling through the concept of block weight: each witness byte has a weight of 1, each other data byte has a weight of 4, and the maximum weight of each block is limited to 4 million. Why assign different weights to different types of data? The common-sense argument is that witness data is only needed for verification and does not need to be kept in long-term storage, so its cost is relatively low and its weight is also low.
This is effectively a disguised increase in the block size limit. The theoretical upper limit of the block size rises to 4MB (reachable only by a block made up almost entirely of witness data), and the average block can reach about 2MB. Seen through the old block structure, which excludes witness data, a block still adheres to the original 1MB limit set by Satoshi Nakamoto.
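The relationship between weight and block size can be worked through numerically. The following is a minimal sketch of the weight rule described above; the function names and the sample witness fractions are assumptions chosen for illustration.

```python
MAX_BLOCK_WEIGHT = 4_000_000   # maximum block weight under SegWit
WITNESS_SCALE = 4              # non-witness bytes weigh 4, witness bytes weigh 1


def weight(non_witness_bytes, witness_bytes):
    """Block (or transaction) weight under the SegWit rule."""
    return non_witness_bytes * WITNESS_SCALE + witness_bytes


def max_block_bytes(witness_fraction):
    """Largest total block size in bytes if `witness_fraction` of it is witness data."""
    # Average weight per byte is 4 * (1 - f) + 1 * f.
    weight_per_byte = WITNESS_SCALE * (1 - witness_fraction) + witness_fraction
    return MAX_BLOCK_WEIGHT / weight_per_byte


# Illustrative witness fractions: none, half, two thirds, all.
for f in (0.0, 0.5, 2 / 3, 1.0):
    print(f"{f:.2f} -> {max_block_bytes(f) / 1_000_000:.2f} MB")
```

With no witness data the limit collapses to the old 1MB; a block that is roughly two-thirds witness data lands near 2MB; only a block that is all witness data approaches 4MB. In every case the non-witness part alone stays within 1MB, which is what old nodes check.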
Taproot
Using Bitcoin's opcodes such as OP_IF, we can set complex conditions for Bitcoin's spending scripts, such as time locks, multi-signatures, etc. However, complex spending conditions often require multiple inputs and signatures for verification, thereby increasing the block load and reducing transaction speed, while exposing all payment conditions and causing privacy leaks.
Taproot enhances Bitcoin with MAST (Merklized Abstract Syntax Trees): users express spending conditions as a Merkle tree in which each leaf node represents a spending script. When spending, only the script actually executed and its Merkle path need to be provided, without revealing the other conditions. This reduces block space consumption and improves privacy.
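The idea of revealing one script plus a Merkle path can be sketched as follows. This is a simplified illustration, not Taproot's actual encoding: it uses plain SHA-256 instead of Taproot's tagged hashes, assumes a power-of-two number of leaves, and uses made-up script placeholders.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def branch(a: bytes, b: bytes) -> bytes:
    # Sort the two children so a proof needs no left/right flags.
    return h(min(a, b) + max(a, b))


def build_levels(leaves):
    """All tree levels from leaf hashes up to the root (leaf count must be a power of two)."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([branch(prev[i], prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def merkle_path(levels, index):
    """Sibling hashes needed to recompute the root from the leaf at `index`."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling at this level
        index //= 2
    return path


def verify(leaf, path, root):
    node = h(leaf)
    for sibling in path:
        node = branch(node, sibling)
    return node == root


# Hypothetical spending conditions, written as placeholder byte strings.
scripts = [b"2-of-3 multisig", b"timelock + key A",
           b"hash preimage + key B", b"recovery key"]
levels = build_levels(scripts)
root = levels[-1][0]
path = merkle_path(levels, 1)
print(verify(scripts[1], path, root), len(path))
```

Spending through the second script reveals that one script and two sibling hashes; the other three conditions stay hidden behind their hashes.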
The Taproot upgrade also introduces Schnorr signatures, which are linear (additively homomorphic), allowing signature aggregation and batch verification, thereby increasing the overall number of transactions per second (TPS). Signature aggregation greatly simplifies the logic of verifying multi-signature transactions. Previously, ECDSA multi-signature payments required multiple signatures to be sent to the chain to match the script, while Schnorr signatures only require a single signature, aggregated off-chain, to be sent to the chain, thus reducing the on-chain space usage of multi-signature payments.
Schnorr signatures are combined with MAST through the concept of Pay to Contract (P2C): the complex contract code is committed to via the MAST root, which is used to tweak a public key and produce a standard Bitcoin public key that can still be spent with a single Schnorr signature.
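The additive property that makes aggregation possible can be shown with a toy example. The sketch below is an assumption-laden illustration only: it uses integers modulo a prime instead of Bitcoin's elliptic curve, fixed nonces instead of random ones, and naive key combination, which real multi-signature protocols must harden against rogue-key attacks. All constants and names are hypothetical.

```python
import hashlib

P = 2**127 - 1   # toy prime modulus, not suitable for real use
G = 3            # toy base
N = P - 1        # exponents can be reduced mod P - 1 (Fermat's little theorem)


def challenge(R, pub, msg):
    data = f"{R}|{pub}|".encode() + msg
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def verify(pub, msg, sig):
    """Schnorr check in multiplicative form: g^s == R * pub^e."""
    R, s = sig
    e = challenge(R, pub, msg)
    return pow(G, s, P) == (R * pow(pub, e, P)) % P


msg = b"spend output 0"
x1, x2 = 1234567, 7654321   # private keys (hypothetical values)
r1, r2 = 1111111, 2222222   # nonces; must be random and secret in practice

# Public keys and nonce commitments combine by multiplication here
# (the elliptic-curve analogue is point addition).
agg_pub = (pow(G, x1, P) * pow(G, x2, P)) % P
agg_R = (pow(G, r1, P) * pow(G, r2, P)) % P

e = challenge(agg_R, agg_pub, msg)
s1 = (r1 + e * x1) % N      # signer 1's partial signature
s2 = (r2 + e * x2) % N      # signer 2's partial signature
agg_sig = (agg_R, (s1 + s2) % N)

print(verify(agg_pub, msg, agg_sig))
```

The verifier sees one public key and one signature, and runs exactly the same check as it would for a single signer; that is the space saving the paragraph above describes.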
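The Pay to Contract idea can also be sketched with a toy group. As a loudly stated assumption, the code below works with integers modulo a prime instead of Bitcoin's elliptic curve and uses plain SHA-256 for the tweak, so it illustrates the commitment structure only; every constant and function name is hypothetical.

```python
import hashlib

P = 2**127 - 1   # toy prime modulus, not suitable for real use
G = 3            # toy base
N = P - 1        # exponents can be reduced mod P - 1


def tweak_scalar(internal_pub, merkle_root):
    """Hash the internal key together with the script tree root."""
    data = internal_pub.to_bytes(16, "big") + merkle_root
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def commits_to(output_pub, internal_pub, merkle_root):
    """Check that output_pub is internal_pub tweaked by this Merkle root."""
    t = tweak_scalar(internal_pub, merkle_root)
    return output_pub == (internal_pub * pow(G, t, P)) % P


internal_priv = 424242                                          # hypothetical key
internal_pub = pow(G, internal_priv, P)
merkle_root = hashlib.sha256(b"example script tree").digest()   # placeholder root

t = tweak_scalar(internal_pub, merkle_root)
output_pub = (internal_pub * pow(G, t, P)) % P   # elliptic-curve analogue: Q = P + t*G
output_priv = (internal_priv + t) % N

# Key path: the owner can still sign with the tweaked private key.
assert pow(G, output_priv, P) == output_pub
# Script path: anyone given internal_pub and merkle_root can check the commitment.
print(commits_to(output_pub, internal_pub, merkle_root))
```

On chain only output_pub appears, so an output that hides a script tree is indistinguishable from an ordinary single-key output until a script path is actually used.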
Interestingly, since the single and multiple signatures of Schnorr signatures look the same on the chain, the logic of complex scripts, multi-signatures, and single signatures cannot be distinguished on the chain, further enhancing privacy.
Conclusion
Bitcoin’s scalability solutions reflect its evolving approach to improving performance while maintaining decentralization and security.
Initially, increasing the block size was considered to directly address the problem of low transaction rates, but raised issues related to node costs and network forks, posing challenges to community consensus.
The introduction of SegWit marks a major step forward, optimizing block capacity through a soft fork, ensuring backward compatibility and avoiding a divisive hard fork.
Taproot subsequently improved scalability and privacy further through MAST and Schnorr signatures, reducing transaction space and improving verification efficiency. More importantly, Taproot makes more complex script programming practical on Bitcoin, paving the way for future scaling attempts.
These developments highlight Bitcoin's cautious and innovative move toward a more scalable and powerful network, which is crucial to its future as a global payment system.
However, the impact of these scaling solutions is not enough to realize the vision of a "peer-to-peer electronic cash system."