Emergence: when many small individuals interact, they form a larger whole, and this whole exhibits new characteristics that none of the constituent individuals possess. For example, the phenomenon of life studied in biology is an emergent property of chemistry.
Hallucination: a model's tendency to output deceptive data, i.e. output that looks correct but is actually wrong.
The link between AI and Crypto moves in obvious cycles. After AlphaGo defeated professional human Go players in 2016, the crypto world spontaneously produced attempts to combine the two, such as Fetch.AI. Since the emergence of GPT-4 in 2023, the AI + Crypto craze has revived, represented by the issuance of WorldCoin, and humanity seems to be entering a utopian era in which AI is responsible for productivity and Crypto is responsible for distribution.
This sentiment reached its climax after OpenAI launched Sora, its text-to-video application. But sentiment always carries an irrational element, and Li Yizhou, for example, is among the collateral damage:
Specific AI applications and algorithm research are constantly conflated: the Transformer principles behind Sora and GPT-4 are public, but you still have to pay OpenAI to use their products;
The combination of AI and Crypto is still driven from the Crypto side; the AI giants have shown no clear interest yet. At this stage, AI can do more for Crypto than Crypto can do for AI;
Using AI technology in Crypto applications, such as digital humans in on-chain games/GameFi/Metaverse/Web3 games/AW, is not the same as a genuine integration of AI and Crypto;
What Crypto can do for the development of AI technology is mainly to decentralize the three essentials of AI (computing power, data, and models) and to provide token incentives;
WorldCoin is a successful practice of combining the two: zkML sits at the technical intersection of AI and Crypto, and UBI (universal basic income) theory has been put into large-scale practice for the first time.
In this article, I will focus on how Crypto can benefit AI. Current Crypto projects branded around AI applications are mostly gimmicks and are not worth discussing here.
From Linear Regression to Transformer
For a long time, AI discourse has centered on whether the "emergence" of artificial intelligence will produce the machine intelligence or silicon-based civilization of "The Matrix". Such worries have accompanied the relationship between humans and AI technology all along: most recently after the advent of Sora, and earlier with GPT-4 (2023), AlphaGo (2016), and IBM's Deep Blue defeating the world chess champion in 1997.
Equally true is that such worries have never come true. So let's relax and briefly walk through how AI actually works.
We start from linear regression, which is essentially a linear equation in one variable, y = ax + b. For example, Jia Ling's weight-loss mechanism can be summarized this way: x and y represent energy intake and weight respectively, so the more you eat, the fatter you naturally get, and to lose weight you have to eat less.
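Below is a minimal sketch of that idea in Python, with all data points made up: fit a one-variable linear model relating energy intake to weight, then use it to predict.

```python
# Minimal linear-regression sketch; the numbers are hypothetical.
import numpy as np

x = np.array([1500.0, 1800.0, 2100.0, 2400.0, 2700.0])  # intake (kcal)
y = np.array([52.0, 58.0, 63.0, 70.0, 75.0])            # weight (kg)

# Least-squares fit of y = a*x + b.
a, b = np.polyfit(x, y, 1)
print(f"weight = {a:.4f} * intake + {b:.2f}")

# Predict the weight implied by a new intake level.
print("predicted weight at 2000 kcal:", round(a * 2000 + b, 1))
```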
However, this brings some problems. First, human height and weight have physiological limits: three-meter giants and half-ton ladies do not appear, so considering situations beyond those limits is meaningless. Second, simply eating less and exercising more does not accord with the science of weight loss and, in serious cases, can damage the body.
We introduce BMI (Body Mass Index), weight divided by height squared, to measure the reasonable relationship between the two, and we model the relationship between height and weight through three factors: eating, sleeping, and exercise. That means three parameters and two outputs, which is clearly beyond linear regression, so the neural network was born. As the name suggests, a neural network imitates the structure of the human brain: the more you think, the more reasonable the result may be. Stacking extra rounds of deliberate thought, increasing how many times you "think deeply", is deep learning (a far-fetched analogy, but it conveys the idea).
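Here is a minimal sketch of those two ideas, with all numbers hypothetical: the BMI formula from the text, and a tiny untrained network mapping three inputs (eating, sleeping, exercise) to two outputs (height, weight).

```python
# BMI plus a tiny neural network: 3 inputs -> 4 hidden units -> 2 outputs.
# One forward pass only; the weights are random placeholders, not trained.
import numpy as np

def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight divided by height squared."""
    return weight_kg / height_m ** 2

print("BMI of 60 kg at 1.65 m:", round(bmi(60, 1.65), 1))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # 3 inputs -> 4 hidden
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)   # 4 hidden -> 2 outputs

def forward(x: np.ndarray) -> np.ndarray:
    h = np.maximum(0.0, x @ W1 + b1)  # ReLU layer: one round of "thinking"
    return h @ W2 + b2                # outputs: (height, weight), unscaled

x = np.array([0.6, 0.8, 0.3])         # normalized eat / sleep / exercise
print("raw network outputs:", forward(x))
```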
However, deepening the layers is not endless; a ceiling still exists, and past a certain critical point the effect may actually get worse. It therefore becomes very important to understand the relationships in the existing information in a more reasonable way: for example, grasping the finer relationship between height and weight, or finding factors nobody has noticed before. Or suppose Jia Ling hires a top coach but is too embarrassed to say she wants to lose weight; the coach then has to figure out what Jia Ling's hints actually mean.
In this scenario, Jia Ling and the coach form an encoding-decoding pair, passing hints back and forth. Each hint stands in for the true meaning of one party, but unlike the blunt "I want to lose weight, here is a gift for the coach", the true intentions of both sides stay hidden behind the hints.
Notice one fact: if the two parties go back and forth enough times, the meaning of each hint becomes easier to guess, and the relationship between each hint and Jia Ling or the coach becomes clearer and clearer.
Scale this model up and you get a large language model (LLM) in the popular sense; more precisely, an LLM examines the contextual relationships between words and sentences, though today's large models have been extended to scenes such as images and videos as well.
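For the technically curious, here is a minimal sketch of scaled dot-product attention, the mechanism at the heart of the Transformer (Vaswani et al., 2017, cited in the references) that weighs each token against its context; shapes and values are illustrative only.

```python
# Scaled dot-product attention: each token attends to every other token
# in proportion to contextual relevance. Toy shapes, random values.
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # pairwise relevance between tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over context
    return weights @ V                # context-weighted mixture of values

rng = np.random.default_rng(0)
tokens, d = 4, 8                      # 4 tokens, 8-dimensional embeddings
Q, K, V = (rng.normal(size=(tokens, d)) for _ in range(3))
print(attention(Q, K, V).shape)       # -> (4, 8)
```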
In the AI spectrum, whether a simple linear regression or an extremely complex Transformer, each is a kind of algorithm or model. Beyond the model there are two more elements: computing power and data.
Put simply, AI is a machine that ingests data, performs computation, and produces results, only more virtual than physical objects such as robots. In terms of computing power, data, and models, the current commercial Web2 workflow is as follows:
Data is divided into public data, company-owned data, and commercial data. It requires professional annotation and other preprocessing steps before it can be used; for example, Scale AI provides data preprocessing for today's mainstream AI companies;
Computing power comes in two modes: self-built and rented cloud capacity. NVIDIA currently stands alone in GPU hardware, and Jensen Huang spent many years laying the groundwork with the CUDA library, so one company now dominates both the software and hardware ecosystems. Next come computing-power rentals from cloud vendors such as Microsoft's Azure, Google Cloud, and AWS, many of which provide one-stop computing power and model-deployment services;
Models can be divided into two categories: frameworks and algorithms. The framework war has essentially ended: first came Google's TensorFlow, then Meta's PyTorch. But whether it is Google, which proposed the Transformer, or Meta with PyTorch, both are gradually falling behind OpenAI in commercialization, though their strength still cannot be underestimated. The Transformer is currently the dominant algorithm, and the various large models differ mainly in data sources and engineering details.
As mentioned before, AI has a wide range of application fields; the code auditing Vitalik mentioned, for example, is already in use. Looked at from the other direction, what Crypto can do for AI is concentrated mainly in non-technical areas such as decentralized data markets and decentralized computing platforms. There are some practices around decentralized LLMs, but note that using AI to analyze Crypto code and running AI models at scale on a blockchain are not remotely the same thing, and bolting a few Crypto factors onto an AI model hardly counts as a perfect combination.
Crypto is still better at production and incentives. There is no need to force Crypto to change AI's production paradigm; that would be manufacturing melancholy for the sake of a new verse, a hammer in search of nails. The reasonable choice is for Crypto to integrate into the AI workflow and for AI to empower Crypto. The following are the combination points I have identified:
Decentralized data production, such as DePIN data collection, and the openness of on-chain data, a rich vein of transaction data usable for financial analysis, security analysis, and training;
Decentralized preprocessing platforms: traditional pretraining has no insurmountable technical barriers, but behind the large European and American models lies the high-intensity labor of manual annotators in the third world;
Decentralized computing-power platforms: decentralized incentives for, and use of, software and hardware resources such as personal bandwidth and GPU computing power;
zkML: traditional privacy methods such as data desensitization cannot fully solve the problem, whereas zkML can hide the specifics of the data and can also effectively evaluate the authenticity and effectiveness of both open-source and closed-source models;
These four angles are the scenarios I can think of in which Crypto empowers AI. AI is a general-purpose tool, so I will not detail the fields and projects of AI For Crypto; you can research them yourself.
Evidently, Crypto currently contributes mainly in encryption, privacy protection, and economic design; the only point of technical integration is zkML, where some attempts exist. Here you can let your imagination run: if Solana's TPS really reaches 100,000+ in the future, and if Filecoin and Solana were combined perfectly, could an on-chain LLM environment be created? That would enable true on-chain AI and change the currently unequal relationship and standing between Crypto and AI.
Web3 joins the AI workflow
Needless to say, the Nvidia RTX 4090 graphics card is hard currency and is currently difficult to obtain in a certain large Eastern country. More seriously, individuals, small companies, and academic institutions also face a graphics-card crisis; after all, the big commercial companies are the ones with money. If a third path can be opened beyond self-purchase and cloud vendors, it will clearly carry real commercial value and break away from pure hype. The sound logic should be "without Web3, the project cannot survive"; that is the right posture for Web3 For AI.
Data source: Grass and the DePIN automotive full suite
Grass, launched by Wynd Network, is a marketplace for idle bandwidth and an open channel for acquiring and distributing web data. Unlike simple data collection and resale, Grass can also clean and verify data, countering an increasingly closed web. Beyond that, Grass hopes to plug directly into AI models and supply them with directly usable datasets, since AI datasets require professional processing, such as extensive manual fine-tuning, to meet models' particular needs.
To expand: Grass aims to solve data sales, while Web3's DePIN field can produce the data AI needs, currently centered on automotive autonomous driving. Traditionally, autonomous-driving companies had to accumulate data themselves, whereas projects such as DIMO and Hivemapper run directly on cars, collecting ever more driving information and road data.
In the past, autonomous driving required vehicle-recognition technology and high-precision maps. Information such as high-precision maps has been accumulated over long periods by companies like NavInfo, forming a de facto industry barrier. With the help of Web3 data, latecomers have a chance to overtake on the curve.
Data preprocessing: liberating humans enslaved by AI
Artificial intelligence can be split into two parts: manual annotation and intelligent algorithms. Third-world regions such as Kenya and the Philippines handle the manual annotation at the bottom of the value curve, while European and American AI preprocessing companies take the bulk of the revenue and then sell to AI R&D companies.
As AI develops, more companies are eyeing this line of business, and under competition the unit price of data annotation keeps falling. The work is mainly labeling data, similar to solving CAPTCHAs, with no technical threshold, and prices have dropped as low as RMB 0.01.
Against this backdrop, Web3 data-annotation platforms such as Public AI also have a real commercial market: they link AI companies with annotation workers and use incentive systems to replace the purely low-price competition model. Note, however, that mature companies such as Scale AI guarantee reliable annotation quality, so how a decentralized annotation platform controls quality and deters freeloaders is absolutely critical. In essence this is a C2B2B enterprise service, and sheer scale and quantity of data alone will not convince enterprises.
Hardware freedom: Render Network and Bittensor
Note that, unlike Bitcoin mining rigs, there is currently no dedicated Web3 AI hardware; existing computing-power platforms graft a Crypto incentive layer onto mature hardware. In essence they belong to the DePIN field, but because they differ from data-source projects, they are covered here following the AI workflow.
Render Network is an "old project" and was not built entirely for AI. As its name suggests, it was first dedicated to rendering work and began operating in 2017. GPUs were not yet so frenzied then, but a market opportunity was already emerging: the GPU market, especially for high-end cards, is monopolized by NVIDIA, and high prices block rendering, AI, and metaverse users from entering. If a channel could be built between the demand and supply sides, an economic model similar to bike-sharing had a chance of being established.
Moreover, GPU resources do not require physical handover of the hardware; software-level allocation is enough. Worth noting, too, is that Render Network migrated to the Solana ecosystem in 2023, abandoning Polygon at a time when Solana was out of favor, a decision since proven right: for GPU usage and allocation, a high-speed network is a necessity.
If Render Network is the old guard, then Bittensor is at its peak right now.
Bittensor is built on Polkadot. Its goal is to train AI models through economic incentives: nodes compete over who can train an AI model to the smallest error or the highest efficiency, which makes it a Crypto project more in line with the classic "AI on-chain" flow. The actual training, however, still requires NVIDIA GPUs and traditional platforms, making it broadly similar to competition platforms such as Kaggle.
zkML and UBI: the A-side and B-side of Worldcoin
Zero-knowledge machine learning (zkML) introduces zk technology into the AI model training process to address data leakage, privacy failure, and model verification. The first two are easy to understand: zk-encrypted data can still be used for training, while personal or private data no longer leaks.
Model verification addresses the evaluation of closed-source models. With zk technology, a target value can be set, and a closed-source model can then prove its capability by having its results verified, without disclosing the computation process.
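To make the interaction concrete, here is a toy commit-and-reveal sketch in Python. It is emphatically not zero-knowledge (real zkML relies on proof systems such as zk-SNARKs, and the reveal step below would be replaced by a proof); it only shows the shape of the protocol, and every name and number in it is hypothetical.

```python
# Toy commit-and-reveal sketch of closed-source model verification.
# NOT zero-knowledge: real zkML would prove the evaluation instead of
# revealing the model. All names and values are hypothetical.
import hashlib

def commit(weights: bytes) -> str:
    """Publish only a hash of the (closed-source) model weights."""
    return hashlib.sha256(weights).hexdigest()

def accuracy(weights: bytes, test_set: list) -> float:
    """Stand-in evaluation; a real system proves this in zero knowledge."""
    correct = sum(1 for x, y in test_set if (x + weights[0]) % 2 == y)
    return correct / len(test_set)

model = bytes([1, 7, 3, 9])        # pretend closed-source weights
commitment = commit(model)          # published in advance

target = 0.5                        # public target value to beat
test_set = [(0, 1), (1, 0), (2, 1), (3, 0)]

# Later, an auditor checks that the revealed model matches the prior
# commitment and that its results meet the public target.
assert commit(model) == commitment
print("meets target:", accuracy(model, test_set) >= target)
```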
Worldcoin is not only an early mainstream project that envisioned zkML but also a proponent of UBI (universal basic income). In its vision, AI's future productivity will far exceed the upper limit of human demand, so the real problem becomes distributing AI's gains fairly. The UBI concept is to be shared with users worldwide through the $WLD token, which is why real-person biometric identification is required to uphold fairness.
Of course, both zkML and UBI are still at an early, experimental stage, but they are interesting enough that I will keep following them.
Conclusion
The development of AI, as represented by the Transformer and the LLM, will gradually hit a bottleneck, just as linear regression and neural networks did; after all, model parameters and data volume cannot grow without limit, and the marginal returns diminish.
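For a sense of scale, the scaling-laws paper cited in the references (Kaplan et al., 2020) fits test loss as a power law in parameter count, which is exactly the diminishing-returns shape described above:

$$L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad \alpha_N \approx 0.076, \quad N_c \approx 8.8 \times 10^{13}$$

Each extra order of magnitude of parameters N buys a smaller and smaller reduction in loss L.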
AI may be the seed player for emergent intelligence, but for now the hallucination problem is severe. In fact, the current belief that Crypto can change AI is itself a kind of confidence, and also a textbook hallucination: adding Crypto can hardly solve the hallucination problem technically, but it can at least change part of the status quo in terms of fairness and transparency.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin: "Attention Is All You Need", 2017; arXiv:1706.03762.
Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei: "Scaling Laws for Neural Language Models", 2020; arXiv:2001.08361.
Hao Liu, Wilson Yan, Matei Zaharia, Pieter Abbeel: "World Model on Million-Length Video And Language With RingAttention", 2024; arXiv:2402.08268.
Max Roser (2022): "The brief history of artificial intelligence: The world has changed fast – what might be next?" Published online at OurWorldInData.org. Retrieved from: https://ourworldindata.org/brief-history-of-ai [Online Resource]
An introduction to zero-knowledge machine learning (ZKML)
Understanding the Intersection of Crypto and AI
Grass is the Data Layer of AI
Bittensor: A Peer-to-Peer Intelligence Market