Source: PermaDAO
The Arweave network employs a novel storage endowment mechanism to ensure the permanence of the information it stores. In this article, we discuss in detail how the storage endowment works, and then study its characteristics and risk profile by simulating its behavior with Markov chains.
Let's dive in!
Background: What is an endowment
In the 2019 draft of the Arweave Yellow Paper, we described Arweave's endowment architecture (see Section 3.2.2). The core logic of the Arweave endowment is as follows:
The cost of providing storage has been falling at a strong exponential rate since the advent of information encoding technology. From papyrus to Gutenberg printing, to magnetic drum storage, floppy disks, and flash drives, the cost of encoding and retrieving information has continued to fall for thousands of years. In the digital age, we call this the Kryder rate.
While the exact rate at which costs fall varies, this pattern is robust and has a lot of room to continue: the theoretical data density limit alone is 10^51 times higher than what we achieve today. Furthermore, we do not expect the demand for more efficient data storage to slow down, as humans and machines will always become more efficient if they can access and process more information.
Given these factors, we find that we can price perpetual storage with a single fee by extrapolating an extremely conservative Kryder rate. We do this by charging users a base fee equivalent to 200 years of storage at current costs, and then letting the storage purchasing power of this endowment grow as storage costs fall. Each year consumes one year's worth of storage, which is 1/200 (0.5%) of the initial endowment. So as long as the Kryder rate remains above 0.5%, the storage purchasing power of the endowment at the end of the year will be greater than at the beginning of the year.
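The 0.5% threshold can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the protocol's pricing code: it measures the endowment in "years of storage at the current cost", assumes one year of storage is paid out per year, and uses function names that are our own.

```python
# Sketch: storage purchasing power of an endowment, measured in
# "years of storage at the current cost". All names are illustrative.

INITIAL_YEARS = 200  # base fee from the text: 200 years at today's cost


def purchasing_power_after_one_year(years_now: float, kryder_rate: float) -> float:
    """Pay for one year of storage, then re-price what is left.

    years_now:   purchasing power at the start of the year, in years
    kryder_rate: fractional annual decline in storage cost (0.005 = 0.5%)
    """
    remaining = years_now - 1.0             # one year of storage is paid out
    return remaining / (1.0 - kryder_rate)  # cheaper storage stretches the rest


def total_cost_of_perpetual_storage(kryder_rate: float) -> float:
    """Geometric series 1 + (1-k) + (1-k)^2 + ... = 1/k,
    expressed in units of today's annual storage cost."""
    return 1.0 / kryder_rate


for k in (0.004, 0.005, 0.006):
    after = purchasing_power_after_one_year(INITIAL_YEARS, k)
    print(f"Kryder rate {k:.1%}: {INITIAL_YEARS} -> {after:.3f} years of storage; "
          f"perpetual cost = {total_cost_of_perpetual_storage(k):.1f} years")
```

At exactly 0.5% the purchasing power stays at 200 years and the cost of storing the data forever equals the 200-year fee; any higher rate leaves a surplus.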
Once the protocol approaches the end of its lifecycle, the size and cost of the dataset will drop to extremely low levels. Due to its small size, we expect it to be altruistically "imported" into the next permanent information storage system to continue replicating the data. This mirrors the pattern in which Gopher archives appeared on the modern web.
Defining Kryder+ Rates
In practice, the Arweave network uses a modified version of the original Kryder rate, which we will refer to in this article as the “Kryder+” rate. The Kryder+ rate includes not only the raw data storage, but also other factors required to keep a network like Arweave online: replication, power, and operating costs. We note that these factors are all affected by the falling cost of storage:
Power: Changes in data density and reliability (the factors that have the greatest impact on the Kryder rate) are rarely, if ever, accompanied by increases in electricity usage. Thus, as the capacity of storage media increases, the relative energy consumption of storing a given amount of data also decreases.
Operational costs: Similar to power consumption, as the efficiency of individual digital storage media increases, the number of devices required to store data (and the operational overhead of maintaining those devices) also decreases.
In the current version of Arweave (2.5.3), the Kryder+ rate assumes a target of 45 replicas of the dataset, along with a 2x storage fee to cover operational and energy expenses.
After upgrading to Arweave 2.6, the network will automatically derive the Kryder+ rate from the price at which miners are willing to offer storage. Since miners have an incentive to minimize the price as they compete with each other, the network can use them as a trustless oracle to determine this price.
It is worth noting that Arweave's formulation of the Kryder+ rate does not include bandwidth costs. Arweave uses a separate set of reputation-based incentives to address bandwidth.
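To make these parameters concrete, here is a rough sketch of how they might combine into a one-time fee. The way the factors are multiplied together is our assumption rather than the protocol's actual pricing formula, and the raw cost per GB-year is a made-up placeholder; only the numbers 45, 2x and 200 come from the text.

```python
# Hypothetical sketch: 45 replicas, the 2x overhead and 200 years come from
# the article; the multiplication and the raw cost value are assumptions.

TARGET_REPLICAS = 45       # replicas of the dataset (Arweave 2.5.3)
OVERHEAD_MULTIPLIER = 2.0  # 2x for operational and energy expenses
ENDOWMENT_YEARS = 200      # years of storage charged up front


def annual_cost_per_gb(raw_cost_per_gb_year: float) -> float:
    """Yearly cost of keeping 1 GB on the network, all replicas included."""
    return raw_cost_per_gb_year * TARGET_REPLICAS * OVERHEAD_MULTIPLIER


def upfront_fee_per_gb(raw_cost_per_gb_year: float) -> float:
    """One-time fee equal to ENDOWMENT_YEARS of storage at current cost."""
    return annual_cost_per_gb(raw_cost_per_gb_year) * ENDOWMENT_YEARS


raw = 0.001  # placeholder: USD per GB per year for a single raw copy
print(f"annual cost per GB:  {annual_cost_per_gb(raw):.4f} USD")
print(f"upfront fee per GB:  {upfront_fee_per_gb(raw):.2f} USD")
```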
Simulating the Endowment
Now that we have covered the theoretical background of the Arweave endowment and its practical application in the real network, we can simulate the mechanism to observe possible real-world outcomes. To do so, we use a Markov chain-based simulation technique. The model steps through many potential futures one at a time, year by year, and then collates the results.
Simulated Factors
The Kryder+ rate is the main factor in the Arweave endowment simulation. In this model, we base it on a dataset of hard drive costs over time. From this data, we observe an average Kryder rate of about 38%. In addition to the real-world data, we add a layer of "pessimism" about future versus past progress in order to stress test how the endowment behaves in less fortunate times. We define this "pessimism" factor as the percentage of the historical decline in storage costs that we expect to continue into the future. For example, a 10% pessimism rate means that we assume future reductions in storage costs will be only 10% as large as they were in the past.
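Read literally, the pessimism factor simply scales the historical rate. The sketch below assumes that this scaling is a plain multiplication (the article does not spell out the formula); the function name is ours. This reading is consistent with the 0.76% rate quoted later in the article, which is 2% of 38%.

```python
# Sketch, assuming effective rate = historical rate x pessimism factor.

HISTORICAL_KRYDER_RATE = 0.38  # average decline observed in the hard drive data


def effective_kryder_rate(pessimism: float,
                          historical: float = HISTORICAL_KRYDER_RATE) -> float:
    """Share of the historical cost decline assumed to continue in the future.

    pessimism: fraction in [0, 1]; 0.10 means future progress is only 10%
               as large as past progress.
    """
    if not 0.0 <= pessimism <= 1.0:
        raise ValueError("pessimism must be between 0 and 1")
    return historical * pessimism


for p in (0.0, 0.02, 0.10, 1.0):
    print(f"pessimism {p:>4.0%} -> effective Kryder+ rate {effective_kryder_rate(p):.2%}")
```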
Another important factor when simulating the Arweave endowment is the volatility of its token price. Arweave uses a floating-price token for its endowment for two main reasons:
However, one impact of the floating token price is that the "fiat value" of the endowment is unstable. To model this in the simulation, we assume that fluctuations in the value of the endowment are pessimistic and price-neutral. That is, fluctuations in the value of the simulated endowment average out to zero in aggregate, but in any individual run they move the price up and down over the period.
To allow each individual simulation to terminate in a reasonable amount of time, the simulation stops after 10,000 years or when the endowment value reaches zero.
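The article does not publish the simulation code, so the following is only a minimal sketch of what one such year-by-year run could look like. It assumes (none of this is confirmed by the source) that the endowment is tracked in units of today's annual storage cost, that one year of storage is paid at the start of each year, that the annual price shock is drawn uniformly from [-max volatility, +max volatility], and that a run ends once the endowment can no longer cover a year of storage.

```python
import random
from statistics import mean

MAX_YEARS = 10_000   # simulation cut-off from the article
INITIAL_YEARS = 200  # the endowment funds 200 years at today's cost


def simulate_lifetime(kryder_rate: float, max_volatility: float,
                      rng: random.Random) -> int:
    """Return how many years one simulated endowment keeps paying for storage."""
    value = float(INITIAL_YEARS)  # fiat value, in units of today's annual cost
    annual_cost = 1.0
    for year in range(MAX_YEARS):
        if value < annual_cost:
            return year           # the endowment has run out
        value -= annual_cost      # release tokens to pay for this year
        # Assumed price-neutral shock: zero on average, up or down per year.
        value *= 1.0 + rng.uniform(-max_volatility, max_volatility)
        annual_cost *= 1.0 - kryder_rate  # storage gets cheaper each year
    return MAX_YEARS


def average_lifetime(kryder_rate: float, max_volatility: float,
                     runs: int = 50, seed: int = 0) -> float:
    """Collate many independent runs into an average lifetime."""
    rng = random.Random(seed)
    return mean(simulate_lifetime(kryder_rate, max_volatility, rng)
                for _ in range(runs))


print("no cost decline, no volatility:", average_lifetime(0.0, 0.0), "years")
print("0.76% Kryder+, 30% max volatility:", average_lifetime(0.0076, 0.30), "years")
```

Because each year's state depends only on the previous year's value and cost, each run is a simple Markov chain. The numbers this sketch produces depend on the assumptions above and should not be expected to reproduce the article's figures.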
Endowment Lifetime
The easiest way to understand the endowment's behavior is to look at the average number of years it survives under different external conditions.
The figure above shows the endowment lifetime. The horizontal axis represents different levels of maximum annual token price volatility, and the vertical axis represents the effective Kryder+ rate (listed alongside the corresponding "pessimism" value relative to the real-world data).
The first important cell to note in this graph is at 0% volatility and 0% pessimism. A pessimism value (and therefore a Kryder+ rate) of 0% means that we assume storage costs will never decrease again. In this case, the network should still be able to store user data for at least 200 years while remaining economically functional. This parameter was chosen so that even those who are deeply skeptical of future technological advances can be confident that storing their data remains economically viable for at least about 3 generations before altruistic storage is required.
Another important observation from this graph concerns the region of 30% volatility and 2/4% Kryder+ rate. In our simulation, a maximum token price volatility of 30% means that the token price changes by an average of 15% per year, which is very close to the 14.4% average annual volatility of the S&P 500 index between 1950 and 2015. Assuming this average volatility for the network token price, a Kryder+ rate of just about 2% can produce an endowment life of nearly 2,000 years, while a slightly higher rate can produce an endowment life of over 10,000 years.
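The link between a 30% maximum volatility and a 15% average yearly change follows if the yearly change is drawn uniformly between -30% and +30%; that distribution is our assumption, since the article does not name one. Under it, the expected absolute change is half the maximum, while the signed change averages to zero. A quick numerical check:

```python
import random


def mean_absolute_change(max_volatility: float, samples: int = 100_000,
                         seed: int = 0) -> float:
    """Average size of a yearly price move, ignoring its direction."""
    rng = random.Random(seed)
    total = sum(abs(rng.uniform(-max_volatility, max_volatility))
                for _ in range(samples))
    return total / samples


def mean_signed_change(max_volatility: float, samples: int = 100_000,
                       seed: int = 0) -> float:
    """Average yearly price move including direction (price neutrality)."""
    rng = random.Random(seed)
    total = sum(rng.uniform(-max_volatility, max_volatility)
                for _ in range(samples))
    return total / samples


print(f"mean |change| at 30% max volatility: {mean_absolute_change(0.30):.3f}")
print(f"mean signed change:                  {mean_signed_change(0.30):.3f}")
```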
Furthermore, assuming an eventual average volatility similar to commodities (estimated at around 2-5% by the World Bank), we see that even a Kryder+ rate below 0.76% can result in an endowment life of over 10,000 years.
Deflation Probability
As shown above, in many scenarios the endowment still holds tokens to continue incentivizing data storage even when the simulation terminates after 10,000 years. If we look deeper into the execution of each individual run, we see that the majority of tokens are released from the endowment in the early years of storage:
Given this behavior, when users deposit tokens into the endowment to back the data they store, there is a high probability that some of those tokens will never be released again.
In the above graph, we see the number of tokens that will likely never be released from the endowment for varying degrees of pessimism about future storage cost reductions.
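For intuition, the zero-volatility case can be worked out in closed form. The sketch below is a simplification, not the article's simulation: it assumes a constant token price (so fiat share equals token share), a constant Kryder+ rate k, and an endowment of 200 years at today's cost. Total spending over all time is then the geometric series 1/k, and whatever exceeds it is never released.

```python
# Deterministic sketch (no token price volatility); names are illustrative.

INITIAL_YEARS = 200            # endowment size, in years at today's cost
HISTORICAL_KRYDER_RATE = 0.38  # average rate quoted earlier in the article


def fraction_never_released(kryder_rate: float,
                            initial_years: float = INITIAL_YEARS) -> float:
    """Share of the endowment left over after paying for storage forever."""
    if kryder_rate <= 0.0:
        return 0.0  # costs never fall, so everything is eventually spent
    perpetual_cost = 1.0 / kryder_rate  # 1 + (1-k) + (1-k)^2 + ...
    return max(0.0, 1.0 - perpetual_cost / initial_years)


def share_spent_in_first_years(kryder_rate: float, years: int) -> float:
    """Fraction of all-time spending that happens in the first `years` years."""
    return 1.0 - (1.0 - kryder_rate) ** years


for pessimism in (0.02, 0.10, 1.0):
    k = HISTORICAL_KRYDER_RATE * pessimism
    print(f"pessimism {pessimism:.0%} (k={k:.2%}): "
          f"{fraction_never_released(k):.1%} never released, "
          f"{share_spent_in_first_years(k, 50):.1%} of spending in first 50 years")
```

With token price volatility the picture becomes probabilistic rather than exact, which is why the article reports likely amounts instead of fixed ones.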