Author: Daniel Barabander, General Counsel and Partner at Variant Fund; Translator: 0xjs@黄金财经
Key Points
Foundational AI development is currently dominated by a handful of technology companies, and it is closed and anti-competitive.
Open source software development is an alternative, but foundational AI cannot be built as a traditional open source software project (like Linux) because it faces a "resource problem": contributors would have to donate compute and data at costs far beyond any individual's means.
Crypto solves the resource problem by using ownership to incentivize resource providers to contribute to open source foundational AI projects.
Open source AI combined with crypto can support larger models and drive more innovation, leading to better AI.
Introduction
A 2024 Pew Research Center poll found that 64% of Americans believe social media has had a more negative than positive impact on the United States, 78% say social media companies have too much power and influence in politics today, and 83% say it is likely or very likely that these platforms intentionally censor political viewpoints they disagree with. Dislike of social media platforms is one of the few issues that unites Americans.
Looking back at how the social media experiment has played out over the past 20 years, it seems inevitable that we would end up where we are. You all know the story. A handful of large tech companies initially captured attention and, most importantly, user data. While there was early hope that this data would be made open, these companies quickly reversed course and shut down access after using it to build unassailable network effects. This brings us to the current situation: fewer than ten large social media companies rule like small feudal fiefdoms under an oligopoly, with no incentive to change because the status quo is extremely profitable. It is closed and anti-competitive.
Looking at the current state of the AI experiment, I feel like I am watching the same movie again, but this time the stakes are even higher. A handful of large tech companies have amassed the GPUs and data needed to build foundational AI models, and have locked down access to them. It is already impossible for new entrants (short of raising billions of dollars) to build competing versions because the barriers to entry are too high: the compute capital expenditure to pre-train a foundational model alone runs into the billions, and the social media companies that benefited from the last tech boom are using their control over proprietary user data to build models that competitors cannot replicate. We are on a full-throttle path to recreating in AI what we did in social media: closed and anti-competitive. If we continue down this path of closed AI, a handful of tech companies will have unfettered control over access to information and opportunity.
Open Source AI and the “Resource Problem”
If we don't want a closed AI world, what is the alternative? The obvious answer is to build foundational models as open source software projects. We have countless examples of open source projects that produce foundational software we rely on every day. If Linux showed that something as fundamental as an operating system could be built open source, why should LLMs be any different?
Unfortunately, foundational AI models differ from traditional software in ways that severely hamper their viability as traditional open source software projects. Specifically, foundational AI models require computational and data resources beyond the means of any individual. As a result, unlike traditional open source projects, which rely on people donating their time (itself a hard problem), open source AI also requires people to donate resources in the form of compute and data. This is the "resource problem" of open source AI.
To better understand the resource problem, let's look at Meta's LLaMa models. Unlike its competitors (OpenAI, Google, etc.), Meta does not hide its model behind a paid API; it publishes LLaMa's weights for anyone to use for free (with some restrictions). These weights represent what the model learned during Meta's training process and are required to run it. With the weights, anyone can fine-tune the model or use its outputs as inputs to a new model.
While Meta deserves credit for publishing LLaMa's weights, it is not a true open source software project. Meta trains the model privately using its own compute, data, and decisions, and unilaterally decides when to release it to the world. Meta does not invite independent researchers/developers into the process, because no individual community member could afford the computational or data resources required to train or retrain the model: tens of thousands of high-memory GPUs, the data centers that house them, extensive cooling infrastructure, and trillions of tokens of training data. As the Stanford 2024 AI Index report puts it, "the rising cost of training has effectively excluded universities (traditionally the center of AI research) from developing their own cutting-edge foundational models." To put the costs in context, Sam Altman has said that GPT-4 cost $100 million to train, and that likely excludes capital expenditures; Meta's capital expenditures increased by $2.1 billion year over year (Q2 2024 vs. Q2 2023), driven mainly by investments in servers, data centers, and network infrastructure related to training AI models. So while LLaMa's community contributors may have the technical ability to iterate on the base model architecture, they lack the means to do so.
In summary, unlike traditional open source software projects, where contributors are asked only to give their time, contributors to open source AI projects are asked to give both their time and significant costs in the form of compute and data. It is unrealistic to rely on goodwill and volunteerism to motivate enough parties to provide these resources; further incentives are needed. Perhaps the best illustration of the limits of goodwill and volunteerism in open source AI development is BLOOM, the 176B-parameter open source LLM built by 1,000 volunteer researchers from over 70 countries and more than 250 institutions. While this was undoubtedly an impressive achievement (and one I fully support), it took a year to coordinate a single training run and €3 million in funding from French research institutions (and that figure excludes the capital expenditure on the supercomputer used to train the model, which one of the French institutions already had access to). The process of coordinating and relying on new grants to iterate on BLOOM is too cumbersome and bureaucratic to keep pace with big tech labs; more than two years have passed since BLOOM's release, and I am not aware of any follow-up models from the collective. To make open source AI possible, we need a way to incentivize resource providers to contribute their compute and data without forcing the open source contributors themselves to bear the cost.
Why Crypto Can Solve Open Source AI's Resource Problem
Crypto's breakthrough is using ownership to make resource-intensive open source software projects possible. It solves the resource problem inherent in open source AI by giving prospective resource providers speculative upside in the network, rather than requiring open source contributors to pay the cost of those resources upfront. For proof, look no further than the original crypto project: Bitcoin.
Bitcoin is an open source software project; the code that runs it has been completely open since the day the project began. But the code itself is not the secret sauce: downloading and running the Bitcoin node software to create a blockchain that exists only on your local computer accomplishes little. The software is only useful when the amount of computation spent mining blocks exceeds the computational power of any single contributor. Only then can its added value be realized: maintaining a ledger that no one controls. Like foundational open source AI, Bitcoin is an open source software project that requires resources beyond the means of any single contributor. The two need this computation for different reasons (Bitcoin to make the network tamper-proof, foundational AI to iterate on the model), but the broader point is the same: both require resources beyond any single contributor's means to function as viable open source software projects.
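To make the mining dynamic concrete, here is a minimal proof-of-work sketch in Python. It is illustrative only: real Bitcoin hashes an 80-byte block header with double SHA-256 against a compact-encoded difficulty target, while the header bytes, nonce encoding, and difficulty here are simplified assumptions.

```python
import hashlib

def mine(header: bytes, difficulty_bits: int) -> int:
    """Search for a nonce whose double-SHA-256 hash falls below the target.

    The expected number of hash attempts is roughly 2**difficulty_bits,
    which is why mining cost scales with difficulty.
    """
    target = 1 << (256 - difficulty_bits)  # hashes below this value "win"
    nonce = 0
    while True:
        digest = hashlib.sha256(
            hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
        ).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

# Simplified example: ~2**16 hash attempts on average.
nonce = mine(b"example block header", difficulty_bits=16)
print(f"found nonce {nonce}")
```

Verifying a winning nonce takes one hash, while finding it takes exponentially many attempts; this asymmetry is what lets the network cheaply check work that no single contributor could dominate.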
The magic trick that Bitcoin, and every crypto network since, uses to get participants to provide resources to an open source software project is network ownership in the form of tokens. As Jesse wrote in his founding thesis for Variant back in 2020, ownership incentivizes resource providers to contribute to a project in exchange for potential upside in the network. This is similar to how sweat equity is used to launch a fledgling company: by paying early employees (e.g., founders) primarily in ownership of the business, a startup can overcome the bootstrapping problem by gaining access to a workforce it could not otherwise afford. Crypto extends the concept of sweat equity beyond people who donate their time to people who provide resources. Accordingly, Variant focuses on investing in projects that leverage ownership to build network effects, such as Uniswap, Morpho, and World.
If we want to make open source AI possible, enabling ownership through crypto is the solution to its resource problem. Researchers can freely contribute their model design ideas to open source projects, because the resources needed to realize those ideas are supplied by compute and data providers in exchange for ownership in the project, rather than requiring the researchers to bear high upfront costs. Ownership can take many forms in open source AI, but the form I am most excited about is ownership of the models themselves, like the approach proposed by Pluralis.
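As a sketch of how such ownership might be accounted for, consider allocating each provider a pro-rata share of a model's future inference revenue in proportion to the compute they contributed to training. The split rule, function names, and numbers below are hypothetical illustrations, not any project's actual design.

```python
def ownership_shares(contributions: dict[str, float]) -> dict[str, float]:
    """Pro-rata ownership: each provider's fraction of total contributed compute."""
    total = sum(contributions.values())
    return {provider: amount / total for provider, amount in contributions.items()}

def distribute_revenue(revenue: float, shares: dict[str, float]) -> dict[str, float]:
    """Split a model's inference revenue according to ownership shares."""
    return {provider: revenue * share for provider, share in shares.items()}

# Hypothetical example: three providers contribute GPU-hours to one training run,
# then split $10k of inference revenue 60/30/10.
shares = ownership_shares({"alice": 600.0, "bob": 300.0, "carol": 100.0})
payouts = distribute_revenue(10_000.0, shares)
print(payouts)
```

The point of the sketch is the incentive structure: because payouts are a function of the model's future revenue, providers profit only if the model they help train is actually worth using.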
Pluralis calls this approach protocol models: compute providers contribute compute to train a specific open source model and receive ownership of that model's future inference revenue. Because ownership is tied to a specific model, and its value is based on inference revenue, compute providers are incentivized to pick the best models and to train honestly (providing useless training reduces the expected value of future inference revenue). The question then becomes: how can ownership be enforced on Pluralis if the weights must be sent to compute providers for training? The answer is that model parallelism is used to distribute model shards across workers, exploiting a key property of neural networks: a participant can contribute to training a larger model while seeing only a small fraction of its total weights, so the full weight set remains unextractable. And because many different models are trained on Pluralis, a trainer accumulates many different sets of shards, making it extremely difficult to reconstruct any one model. This is the core idea of protocol models: they can be trained and used, but cannot be extracted from the protocol (without spending more compute than training the model from scratch would require). This addresses a concern often raised by critics of open source AI: that closed AI competitors will appropriate the fruits of the open project's labor.
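A toy sketch of the model-parallel idea follows (a pure-Python illustration under simplifying assumptions, not Pluralis's implementation): the network's layers are split across workers so that each worker holds, and only ever sees, its own shard of the weights, yet the shards jointly compute the full model's output by passing activations along the pipeline.

```python
import random

class Worker:
    """Holds one layer's weights; never sees any other worker's shard."""

    def __init__(self, in_dim: int, out_dim: int, seed: int):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1, 1) for _ in range(in_dim)]
                        for _ in range(out_dim)]

    def forward(self, x: list[float]) -> list[float]:
        # Linear layer + ReLU; only activations leave the worker,
        # never the weights themselves.
        return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
                for row in self.weights]

# A 3-layer model sharded across 3 workers: no single party holds all weights.
pipeline = [Worker(4, 8, seed=0), Worker(8, 8, seed=1), Worker(8, 2, seed=2)]

def run_model(x: list[float]) -> list[float]:
    for worker in pipeline:  # activations flow worker to worker
        x = worker.forward(x)
    return x

print(run_model([1.0, 0.5, -0.5, 2.0]))
```

Each worker's view is limited to its own shard plus incoming activations, which is the property the paragraph above relies on: contributing to training without being able to extract the full weight set.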
Why Crypto + Open Source = Better AI
I described the problem of big tech control at the start of this post to make the normative case against closed AI. But in a world where our online experience is tinged with fatalism, I worry that this argument may not resonate with most readers. So in closing, I want to give two reasons why open source AI powered by crypto will actually lead to better AI.
First, combining crypto with open source AI will let us reach the next frontier of foundational models, because it can coordinate more resources than closed AI. Research to date shows that more resources, in the form of compute and data, mean better models, which is why foundational models keep getting bigger. Bitcoin shows us what open source software plus crypto can unlock in computing power: it is the largest and most powerful computing network in the world, orders of magnitude larger than the big tech companies' clouds. Crypto turns isolated competition into cooperative competition. Resource providers are incentivized to contribute their resources to a collective problem rather than hoarding them to solve that problem alone (and redundantly). Open source AI powered by crypto will be able to leverage the world's collective compute and data to build models far larger than closed AI can. Companies like Hyperbolic are already demonstrating the power of pooled computing resources, letting anyone rent out GPUs on an open marketplace at lower prices.
Second, combining crypto with open source AI will drive more innovation. If we can overcome the resource problem, we can return to the highly iterative, innovative open source character of machine learning research. Before the recent foundational LLMs, machine learning researchers spent decades publicly releasing their models along with blueprints for replicating them. These models typically used more limited open datasets and had manageable compute requirements, so anyone could iterate on them. It was through this iteration that we made progress in sequence modeling, from RNNs to LSTMs to attention mechanisms, which made possible the Transformer architecture that current foundational LLMs rely on. But all of this changed with the launch of GPT-3 (which reversed the open source trend of GPT-2) and the runaway success of ChatGPT, because OpenAI proved that throwing enough compute and data at massive models yields LLMs that appear to understand human language. This created resource requirements priced beyond academia's reach, and big tech labs largely stopped publicly releasing their model architectures to stay ahead of the competition. The current state, in which we rely primarily on a few individual labs, limits our ability to push the boundaries of the state of the art. Open source AI enabled by crypto would let researchers resume this iterative process on cutting-edge models and discover the "next Transformer."