Author: Frank-Zhang.eth, Twitter: @dvzhangtz
The author believes that artificial intelligence represents a new type of productivity and is the direction of human development; the combination of Web3 and AI will make Web3 a new type of production relationship for the new era, a way to organize future human society, and a way to avoid an absolute monopoly by AI giants.
As someone who has long worked on the front line of Web3 primary-market investment, and a former AI researcher, I feel it is my responsibility to write a mapping of this track.
I. Objective of this article
In order to understand AI more fully, we need to understand:
1. Some basic concepts of AI, such as: what machine learning is, and why we need large language models.
2. The steps of AI development, such as: data acquisition, model pre-training, model fine tuning, and model use, and what each step does.
3. Some emerging directions, such as: external knowledge bases, federated learning, ZKML, FHEML, prompt learning, and skill neurons.
4. Which Web3 projects correspond to each link of the AI chain.
5. Which links in the AI chain carry more value or are more likely to produce big projects.
When describing these concepts, the author will try to avoid formulas and strict definitions, and instead describe them through analogies.
This article covers as many new terms as possible. The author hopes to leave an impression in the reader's mind, so that when you encounter these terms in the future, you can come back and check where they sit in the knowledge structure.
II. Basic concepts
Part 1
Today, the Web3 + AI projects we are familiar with rely on technology from the neural-network branch of machine learning, which in turn is a branch of artificial intelligence.
The following paragraphs mainly define some basic concepts: artificial intelligence, machine learning, neural networks, training, loss function, gradient descent, reinforcement learning, and expert systems.
Part 2
Artificial Intelligence
Definition: Artificial intelligence is a new technical science that studies and develops theories, methods, technologies and application systems that can simulate, extend and expand human intelligence. The research goal of artificial intelligence is to enable intelligent machines to: listen, see, speak, think, learn and act.
My definition: the machine gives results that are indistinguishable from those given by a person, so that it is hard to tell which is which (the Turing test).
Part 3
Expert System
If the steps and the knowledge required for a task can be clearly written down: expert system.
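To make this concrete, here is a toy sketch in Python of the expert-system idea (the symptoms and rules are made up purely for illustration): the knowledge is written down by a human expert as explicit rules, and no training data is involved.

```python
# Toy expert system: knowledge is hand-written rules, no training data needed.
# The symptoms and rules below are invented for illustration only.
def diagnose(symptoms: set) -> str:
    if {"fever", "cough"} <= symptoms:
        return "suspected flu"
    if "rash" in symptoms:
        return "refer to a dermatologist"
    return "no rule matched"

print(diagnose({"fever", "cough"}))  # -> suspected flu
```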
Part 4
If it is hard to write down how to do something, let the machine learn it from examples: neural networks (machine learning).
How does a neural network teach a machine a piece of knowledge? We can use the following analogy:
Suppose you want to teach a puppy to pee on a mat (a classic example, no bad intent), which stands for teaching a machine a piece of knowledge:
Method 1: If the dog urinates on the mat, reward it with a piece of meat, otherwise spank it
Method 2: If the dog urinates on the mat, reward it with a piece of meat, otherwise spank it; and the farther away from the mat, the harder the spanking (calculate the loss function)
Method 3: Every time the dog takes a step, a judgment is made:
If it walks towards the mat, reward it with a piece of meat, if it does not walk towards the mat, spank it
(calculate the loss function once for each training)
Method 4: Every time the dog takes a step, make a judgment: if it walks towards the mat, reward it with a piece of meat; if it does not, spank it. In addition, place a piece of meat in the direction of the mat to attract the dog towards it. (After each training step, the loss function is calculated, and then gradient descent is performed in the direction that reduces the loss the most.) A small numeric sketch of this loss-and-gradient loop appears at the end of Part 6 below.
Part 6
Why have neural networks made rapid progress in the past decade? Because over the past decade humans have made rapid progress in computing power, data, and algorithms.
Computing power: Neural networks were actually proposed in the last century, but the hardware of that time took too long to run them. With the development of chip technology this century, the computing power of chips has doubled roughly every 18 months, and there are now chips such as GPUs that excel at parallel computing, which makes the training time of neural networks "acceptable".
Data: Social media and the Internet have accumulated a large amount of training data, and large companies also have related automation needs.
Model: With computing power and data, researchers have developed a series of more efficient and accurate models.
"Computing power", "data" and "model" are also called the three elements of artificial intelligence.
Part 7
Why is the Large Language Model (LLM) important?
Why should we pay attention: we are gathered here today because everyone is curious about AI + Web3; AI is popular because of ChatGPT; and ChatGPT is a large language model.
Why do we need a large language model: As we said above, machine learning requires training data, but the cost of large-scale data annotation is too high; the large language model solves this problem in a clever way.
Part 8
BERT: the first large language model
What if we don't have labeled training data? A sentence of human speech is itself a piece of annotation; we can use the cloze method to create training data.
We can hollow out some words in a passage and let a Transformer-architecture model (the architecture itself is not important here) predict which words should fill those blanks (let the dog find the mat);
If the model predicts wrong, calculate the loss function and perform gradient descent (if the dog walks towards the mat, reward it with a piece of meat; if it does not, spank it; and place a piece of meat in the direction of the mat to attract the dog towards it).
In this way, all the text on the Internet can become training data. This training process is called "pre-training", so large language models are also called pre-trained models. You can give such a model a sentence and let it guess, word by word, what should be said next; the experience is the same as using ChatGPT today.
My understanding of pre-training: Pre-training allows machines to learn common human knowledge from corpus and cultivate "sense of language".
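As a small illustration of the cloze idea, the Hugging Face transformers library exposes a "fill-mask" pipeline; the sketch below (the checkpoint and the sentence are just examples) asks a pre-trained BERT to guess the hollowed-out word.

```python
# Minimal cloze ("fill-mask") sketch with the public bert-base-uncased
# checkpoint; the sentence and the printed scores are illustrative only.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for candidate in fill("The dog walked towards the [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```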
Part 9
Subsequent development of large language models
After BERT was proposed, everyone found that this thing really works!
You only need to make the model bigger and give it more training data, and the results keep getting better; this is not mindless brute force.
Surge in training data: BERT was trained on all of Wikipedia plus book data; later, training data was expanded to English text from the entire web, and then to the whole web in all languages.
The number of model parameters also grew rapidly.
III. Steps in AI development
Part 1
Pre-training data acquisition
(This step is generally done only by large companies or large research institutes.) Pre-training requires a huge amount of data: all kinds of web pages across the entire Internet are crawled, terabytes of data are accumulated, and the data is then pre-processed.
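As a rough sketch of what "pre-process" means here (the cleaning rules and thresholds below are simplified assumptions; real pipelines are far more elaborate), crawled pages are typically stripped of markup, filtered for length, and de-duplicated:

```python
import hashlib
import re

# Simplified pre-processing for crawled web text: strip HTML tags,
# normalize whitespace, drop near-empty pages, and remove exact duplicates.
def clean(raw_html: str) -> str:
    text = re.sub(r"<[^>]+>", " ", raw_html)   # crude tag removal
    return re.sub(r"\s+", " ", text).strip()

def preprocess(pages: list) -> list:
    seen, docs = set(), []
    for page in pages:
        doc = clean(page)
        if len(doc) < 200:                     # drop very short documents
            continue
        digest = hashlib.md5(doc.encode()).hexdigest()
        if digest in seen:                     # exact-duplicate filter
            continue
        seen.add(digest)
        docs.append(doc)
    return docs
```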
Part 2
Model secondary pre-training
(Optional) Pre-training lets the machine learn common human knowledge from the corpus and develop a "sense of language", but if we want the model to know more about a certain field, we can take the corpus of that field and feed it to the model for secondary pre-training.
For example, Meituan, as a food-delivery platform, needs a large model that knows more about food delivery, so it used the Meituan-Dianping business corpus for secondary pre-training and developed MT-BERT. The resulting model works better in related scenarios.
My understanding of secondary pre-training: Secondary pre-training makes the model an expert in a certain scenario
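A minimal sketch of secondary pre-training, assuming the Hugging Face transformers and datasets libraries and a tiny made-up food-delivery corpus; the objective is still masked-language modelling, only the corpus changes. This is an illustration of the idea, not Meituan's actual MT-BERT recipe.

```python
# Continue masked-LM training on an in-domain corpus ("secondary pre-training").
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

corpus = ["The rider delivered the noodles in twenty minutes.",
          "Refund requests spike when orders arrive cold."]   # toy domain corpus

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

ds = Dataset.from_dict({"text": corpus}).map(
    lambda ex: tok(ex["text"], truncation=True, max_length=64),
    remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="domain-bert", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tok, mlm_probability=0.15),
)
trainer.train()   # same objective as pre-training, narrower corpus
```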
Part 3
Model fine tune training
(Optional) If we want the pre-trained model to become an expert at a certain task, such as sentiment classification, topic extraction, or reading comprehension, we can fine tune the model with data for that task.
But here the data needs to be labeled. For example, sentiment classification needs data like the following:
The key maker asked me: "Do you match it?" (label: neutral)
The strong Xiao Wang next door asked me: "Do you match it?" (label: negative)
My understanding of fine tuning: fine tuning makes the model an expert in a certain task.
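A minimal fine-tuning sketch, again assuming Hugging Face transformers and datasets, with two made-up labeled sentences standing in for a real annotated sentiment dataset:

```python
# Fine tune a pre-trained model on a (tiny, illustrative) labeled dataset.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

data = {"text": ["The delivery was fast and the food was warm.",
                 "The order arrived two hours late."],
        "label": [1, 0]}                     # 1 = positive, 0 = negative

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

ds = Dataset.from_dict(data).map(
    lambda ex: tok(ex["text"], truncation=True,
                   padding="max_length", max_length=64))

Trainer(model=model,
        args=TrainingArguments(output_dir="sentiment-bert",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=ds).train()
```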
Part 4
It should be noted that training a model requires a large amount of data transfer between graphics cards. Currently, one of the major categories of AI + Web3 projects is distributed computing power: people from all over the world contribute their idle machines to do something. But using this kind of computing power for full distributed pre-training is very, very difficult, and even distributed fine tuning requires a very clever design, because the time spent transmitting information between graphics cards would exceed the time spent computing; a back-of-envelope estimate follows below.
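A back-of-envelope illustration of why communication dominates (all numbers are rough assumptions): a 7-billion-parameter model stored in fp16 has to move roughly 14 GB of gradients at every synchronization.

```python
params = 7e9                        # assumed model size
bytes_per_param = 2                 # fp16
sync_bytes = params * bytes_per_param       # ~14 GB per gradient sync

home_link = 100e6 / 8               # a 100 Mbit/s home connection, in bytes/s
nvlink = 450e9                      # rough NVLink-class bandwidth inside a server

print(sync_bytes / home_link / 60)  # ~19 minutes per sync over the internet
print(sync_bytes / nvlink)          # ~0.03 seconds between co-located GPUs
```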
Part 5
Model use
Model use is also called model inference. It refers to the process of using the model after training is complete.
Compared to training, model inference does not require graphics cards to transmit a large amount of data, so distributed inference is a relatively easy thing.
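A minimal inference sketch using the Hugging Face pipeline API (it downloads a small default sentiment model; the example sentence is made up): the model is loaded once, and each call is a single forward pass with no gradients and no cross-card synchronization.

```python
# Load a trained model once, then answer requests one by one.
from transformers import pipeline

classify = pipeline("sentiment-analysis")   # downloads a small default model
print(classify("The order arrived on time and still hot."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```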
IV. The latest applications of large models
Part 1
External knowledge base
Reason: We want the model to have some knowledge of our own field, but we don't want to spend a lot of money training the model.
Method: Pack a large amount of PDF/document data into a vector database, retrieve the relevant pieces for each query, and feed them to the model as background information in the input.
Case: Baidu Cloud One, Myshell
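A minimal sketch of the method just described, assuming the sentence-transformers library; the document chunks, the question, and the model name are placeholders. The retrieved chunks are simply pasted into the prompt, and the base model itself is never retrained.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Embed document chunks into vectors, retrieve the closest ones for a
# question, and paste them into the prompt as background information.
chunks = ["Our refund policy allows returns within 14 days.",
          "Premium members get free delivery on all orders."]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(question: str, k: int = 1) -> list:
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q                 # cosine similarity (vectors normalized)
    return [chunks[i] for i in np.argsort(-scores)[:k]]

question = "How long do I have to return an item?"
context = "\n".join(retrieve(question))
prompt = f"Answer using this background:\n{context}\n\nQuestion: {question}"
# `prompt` is then sent to any LLM; the model itself is never retrained.
```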
Prompt learning
Reason: We feel that an external knowledge base cannot meet our customization needs for the model, but we don't want to bear the cost of adjusting and training the parameters of the entire model.
Method: Do not train the model itself; use the training data only to learn what kind of prompt should be written.
Case: Widely used today
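A minimal sketch of the idea: the model weights stay frozen, and the labeled examples are used only to score candidate prompts and keep the best one. `ask_llm` below is a toy stand-in, not a real API.

```python
# Toy stand-in for a real LLM API call, only so the sketch runs end to end;
# in practice this would call your model provider.
def ask_llm(prompt: str) -> str:
    return "positive" if "Great" in prompt else "negative"

candidate_prompts = [
    "Classify the sentiment of this review as positive or negative: {x}",
    "Is the customer happy? Answer positive or negative. Review: {x}",
]
labeled = [("The food was cold.", "negative"),
           ("Great service!", "positive")]

def accuracy(template: str) -> float:
    hits = sum(ask_llm(template.format(x=text)).strip() == label
               for text, label in labeled)
    return hits / len(labeled)

# The output of "training" is just the best prompt; no model weights change.
best_prompt = max(candidate_prompts, key=accuracy)
print(best_prompt)
```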
Part 2
Federated Learning (FL)
Reason: To use or train a model we have to hand over our own data, which leaks privacy; this is unacceptable for some financial and medical institutions.
Method: Each institution trains the model locally on its own data; the locally trained models are then gathered in one place and fused into a global model.
Case: Flock
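A minimal FedAvg-style sketch in plain NumPy (the two "hospitals" and their weights are made up): each institution trains locally on its own private data, and only the resulting weights are gathered and averaged, so raw data never leaves the institution.

```python
import numpy as np

# Fuse locally trained models by averaging their weights (FedAvg-style).
def federated_average(local_weights: list) -> dict:
    merged = {}
    for name in local_weights[0]:
        merged[name] = np.mean([w[name] for w in local_weights], axis=0)
    return merged

# Two hypothetical hospitals return locally trained weights for one layer.
hospital_a = {"layer1": np.array([[0.2, 0.4], [0.1, 0.3]])}
hospital_b = {"layer1": np.array([[0.4, 0.2], [0.3, 0.1]])}
global_model = federated_average([hospital_a, hospital_b])
print(global_model["layer1"])   # element-wise average of the two models
```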
FHEML
Reason: Federated learning requires each participant to train a model locally, and that threshold is too high for many participants.
Method: Use fully homomorphic encryption (FHE), so the model can be trained and run directly on encrypted data.
Disadvantages: very slow and expensive
Examples: ZAMA, Privasea
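The following is only a conceptual sketch of the workflow; fhe_encrypt, fhe_infer and fhe_decrypt are hypothetical placeholders, not the API of any real library (ZAMA's Concrete ML is an example of a real framework).

```python
# Conceptual sketch only: these three functions are hypothetical placeholders
# for whatever a real FHE framework provides.
def fhe_encrypt(plaintext, public_key): ...
def fhe_infer(model, ciphertext): ...         # the server computes on ciphertext only
def fhe_decrypt(ciphertext, secret_key): ...

def private_prediction(model, record, public_key, secret_key):
    encrypted = fhe_encrypt(record, public_key)       # server never sees plaintext
    encrypted_result = fhe_infer(model, encrypted)    # all computation stays encrypted
    return fhe_decrypt(encrypted_result, secret_key)  # only the data owner can decrypt
```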
Part 3
ZKML
Reason: When we use model services provided by others, we want to confirm that they are really running the model we asked for, rather than quietly substituting a small model and cutting corners.
Method: Have the provider use zero-knowledge proofs (ZK) to generate a proof that it really performed the computation it claims to have done.
Disadvantages: very slow and expensive
Example: Modulus
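Again only a conceptual sketch of the flow; prove_inference and verify_proof are hypothetical placeholders for a ZK proving system, not the API of Modulus or EZKL.

```python
# Conceptual sketch only: hypothetical placeholders for a ZK proving system.
def prove_inference(model_commitment, inputs, outputs): ...
def verify_proof(proof, model_commitment, inputs, outputs) -> bool: ...

# Service provider: runs the agreed model and attaches a proof to the answer.
#   outputs = model(inputs)
#   proof = prove_inference(model_commitment, inputs, outputs)
#
# Client (or a smart contract): checks the proof instead of trusting the server.
#   assert verify_proof(proof, model_commitment, inputs, outputs)
```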
Skill neurons
left;">Cause: Today's model is like a black box. We feed it a lot of training data, but we don't know what it has learned. We hope to have some way to optimize the model in a specific direction, such as having stronger emotional perception and higher moral standards.
Method: The model is like a brain: some groups of neurons manage emotion and some manage morality. By finding these neurons, we can optimize the model in a targeted way.
Case: Future direction
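A rough sketch of how one might look for such neurons (the activations below are random stand-ins for real model traces, with one neuron artificially made informative): record each neuron's activation on examples with and without a property, and rank neurons by how strongly the activation separates the two groups.

```python
import numpy as np

# Fake activation traces standing in for a real model's hidden states.
rng = np.random.default_rng(0)
acts_pos = rng.normal(0.0, 1.0, size=(200, 768))   # activations on positive texts
acts_neg = rng.normal(0.0, 1.0, size=(200, 768))   # activations on negative texts
acts_pos[:, 42] += 2.0                              # pretend neuron 42 encodes sentiment

# Rank neurons by the gap between their mean activations on the two groups.
gap = np.abs(acts_pos.mean(axis=0) - acts_neg.mean(axis=0))
candidate_neurons = np.argsort(-gap)[:5]
print(candidate_neurons)                            # neuron 42 should rank first
```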
V. Classification of Web3 projects along the AI chain
Part 1
The author divides them into three categories:
Infra: infrastructure for decentralized AI
Middleware: components that let the Infra better serve the application layer
Application layer: end-user products built on top of the Infra and middleware
Part 2
Infra layer: AI infrastructure, as always, falls into three categories: data, computing power, and algorithms (models).
Decentralized algorithm (model):
@TheBittensorHub (research report: x.com/dvzhangtz/stat..), @flock_io
Decentralized computing power:
General computing power: @akashnet_, @ionet
Specialized computing power: @rendernetwork (rendering), @gensynai (AI), @heuris_ai (AI), @exa_bits (AI)
Decentralized data:
Data annotation: @PublicAI_, QuestLab
Storage: IPFS, FIL
Oracle: Chainlink
Index: The Graph
Part 3
Middleware: How to make Infra better serve the application layer
Privacy: @zama_fhe, @Privasea_ai
Verification: EZKL, @ModulusLabs, @gizatechxyz
Application layer: It is actually difficult to classify all applications. We can only list the most representative ones
Data Analysis
Agent
Market: @myshell_ai
Web3 knowledge chatbot:@qnaweb3
Help people do operations:@autonolas
VI. What kind of places are more likely to produce big projects?
First, as in other fields, Infra tends to produce big projects, especially decentralized models and decentralized computing power, whose marginal cost the author feels is low.
Then, inspired by my conversation with @owenliang60, I feel that if a killer application can appear at the application layer, it will also become a top project.
Looking back at the history of large models, it was the killer application ChatGPT that pushed them into the spotlight; it was not a major technical iteration, but an optimization for the chat task. Perhaps phenomenal applications like STEPN or Friend.tech will appear in the AI + Web3 field as well. Let's wait and see.