Author: jolestar Source: X, @jolestar
Last week, I played around with AI Agents. The day before yesterday, I attended an event held by ai16z in Beijing. I wanted to see what AI Agents can actually do now and think about what they can do in the future.
The current state of AI Agents reminds me of the meme of a person hidden inside a vending machine. Everyone imagines that AI Agents have started to develop autonomous consciousness, but in reality there is a developer hidden inside each one. (Please imagine the picture here. I tried to get an AI to generate it, but found that it could not understand "hide".)
How the AI Agent framework basically works
The AI Agent framework currently acts as glue. It connects clients (Twitter, Discord, Telegram, etc.) with various plug-ins (chains, etc.), provides a basic library (memory storage, session isolation, context generation), and connects to the interfaces of various AI platforms.
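The glue role described above can be sketched roughly as follows. This is a minimal illustration and not the API of any real framework: `Message`, `Memory`, `AgentRuntime`, and the "plugin_name: argument" reply convention are all hypothetical names and assumptions made up for this example, and the model is just a function from prompt to text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# All names here are hypothetical; this is not the API of a real framework.

@dataclass
class Message:
    client: str   # e.g. "telegram" or "discord"
    room_id: str  # conversation the message belongs to
    user: str
    text: str

class Memory:
    """Stores history per (client, room) so sessions stay isolated."""

    def __init__(self) -> None:
        self._rooms: Dict[str, List[Message]] = {}

    def remember(self, msg: Message) -> None:
        self._rooms.setdefault(f"{msg.client}:{msg.room_id}", []).append(msg)

    def recall(self, msg: Message, limit: int = 5) -> List[Message]:
        return self._rooms.get(f"{msg.client}:{msg.room_id}", [])[-limit:]

ModelFn = Callable[[str], str]  # stands in for an AI platform interface
Plugin = Callable[[str], str]   # stands in for a chain or other plug-in

class AgentRuntime:
    def __init__(self, model: ModelFn) -> None:
        self.model = model
        self.memory = Memory()
        self.plugins: Dict[str, Plugin] = {}

    def register_plugin(self, name: str, plugin: Plugin) -> None:
        self.plugins[name] = plugin

    def build_context(self, msg: Message) -> str:
        # Context generation: recent history of this session only.
        return "\n".join(f"{m.user}: {m.text}" for m in self.memory.recall(msg))

    def handle(self, msg: Message) -> str:
        self.memory.remember(msg)
        reply = self.model(self.build_context(msg))
        # Assumed convention: "plugin_name: argument" asks the runtime to run a plug-in.
        name, _, arg = reply.partition(":")
        if name in self.plugins:
            return self.plugins[name](arg.strip())
        return reply

def fake_model(prompt: str) -> str:
    """Stub model so the sketch runs without any AI platform."""
    last = prompt.splitlines()[-1]
    if "balance" in last:
        return "chain: balance alice"
    return "echo " + last
```

A client adapter would turn each incoming Twitter, Discord, or Telegram event into a `Message` and pass it to `handle`; swapping `fake_model` for a real AI platform call would not change the rest of the structure.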
How the AI Agent framework is combined with applications and business scenarios
Since AI took off last year, various platforms and tools have emerged. The most critical issue is how to combine AI with applications. Some AI platforms try to provide plug-ins, some create workflow models, and some traditional applications embed AI directly. But the key questions are: 1. Where is the interactive entry point for the application? 2. How does AI integrate with existing business logic?
The interactive entry point that every AI platform offers its users is a dialog box similar to a chat window. Evidently, everyone agrees that interaction with AI applications should be "anthropomorphic". The clever thing about AI Agents is that they connect directly to existing open IM and social systems, which users accept more easily than a newly created interface.
How does AI integrate with existing business logic? The solution offered by AI Agents is to let developers build AI decisions into business scenarios. Programming languages require certainty: an if condition can only be true or false, so it cannot handle fuzzy business logic. With AI, fuzzy logic can be converted into precise conditions, which can then be integrated seamlessly into business scenarios.
Take replying to messages in a group as an example. A traditional IM bot needs explicit commands to trigger it, whereas with AI you can implement a method shouldReplyMessage, give it the context, and have it return true or false.
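As a rough sketch of the shouldReplyMessage idea, the function below asks a model a yes/no question and turns the answer into a boolean. The prompt wording, the `ModelFn` type, and the Python-style name `should_reply_message` are my own assumptions for illustration; a real framework will phrase and parse this differently.

```python
from typing import Callable, List

# Hypothetical model interface: takes a prompt, returns the model's text.
ModelFn = Callable[[str], str]

def should_reply_message(model: ModelFn, context: List[str], message: str) -> bool:
    """Ask the model whether the bot should reply, and return a strict boolean."""
    prompt = (
        "You are a bot in a group chat. Given the recent conversation, decide "
        "whether the bot should reply to the latest message.\n"
        "Answer with exactly one word: YES or NO.\n\n"
        "Conversation:\n" + "\n".join(context) + "\n\n"
        "Latest message:\n" + message
    )
    answer = model(prompt).strip().upper()
    # Anything other than an explicit YES is treated as "do not reply".
    return answer.startswith("YES")
```

The rest of the program only ever sees true or false, so the fuzzy judgment stays inside this one function and the surrounding if/else logic remains ordinary code.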
The main roles of AI in business logic are:
1. "Intent" discovery: Using the instructions in the prompt, let the AI discover the "intent" in the user's text message from the context, and map that intent to specific code.
2. Assisted decision-making: Use AI to convert fuzzy and complex conditions into a definite true/false or enumeration value, and then combine those values into business logic.
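The two roles above can be combined in a small sketch: the model maps free text to an enumeration value, and ordinary code dispatches on that value. The `Intent` labels, the prompt, and the handlers are hypothetical examples, not part of any real framework.

```python
from enum import Enum
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # hypothetical: prompt in, model text out

class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    UNKNOWN = "unknown"

def discover_intent(model: ModelFn, text: str) -> Intent:
    """Let the model pick a label, then coerce it into a definite enum value."""
    options = ", ".join(i.value for i in Intent)
    prompt = (
        f"Classify the user's intent as one of: {options}.\n"
        "Reply with the label only.\n\n"
        f"User message: {text}"
    )
    label = model(prompt).strip().lower()
    try:
        return Intent(label)
    except ValueError:
        # Unexpected model output falls back to a safe default.
        return Intent.UNKNOWN

# The mapping from intent to code is plain, deterministic business logic.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "starting transfer flow",
    Intent.CHECK_BALANCE: lambda text: "looking up balance",
    Intent.UNKNOWN: lambda text: "sorry, I did not understand that",
}

def dispatch(model: ModelFn, text: str) -> str:
    return HANDLERS[discover_intent(model, text)](text)
```

Falling back to `UNKNOWN` whenever the model returns an unexpected label keeps the model's fuzziness from leaking into the deterministic part of the program.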
Having read this far, many people may be disappointed with AI Agents. Many assume that an AI Agent can do anything once it has been taught. In fact, because of the context limits of large models, there is no way (at least for now) to create an all-purpose AI that can do everything. The good news is that programmers don't have to worry about losing their jobs. There are still plenty of programmers behind the AI, and someone still has to stack up the if/else statements. The key difference is that the business boundaries that programs can handle are expanding.
Two types of AI Agents
At the event, @shawmakesmagic was asked the following question. The market has two expectations for AI Agents: 1. An AI Agent plays a role, has its own ID and brand, and provides services to users. 2. Each user has a personal AI Agent, equivalent to a personal assistant, that helps the user handle some tasks. Which of these two kinds of AI Agent will be more popular? He thinks both directions are promising and that they can be combined.
For now, the market is mainly exploring the first direction, which amounts to turning services into AI Agents. In the future there may be no App interface at all, and every App will become a personified AI Agent. The second direction turns application clients into agents. In the future, an application client will be a plug-in of the assistant agent, and the application's local data will become part of the agent's memory. The same plug-in will also be responsible for communicating with the service agent in the cloud. This is a new application architecture that will change the entire infrastructure.
AI Agent Requirements for Infrastructure
1. The infrastructure must be permissionless. Otherwise AI Agents will be blocked by various anti-attack strategies. Services should instead use an economic cost (Gas) to deter attacks. In this respect, platforms that are not open will be hit harder, and the enthusiasm for open platforms seen in the early days of Web2 will be rekindled.
2. AI Agents need to be able to control funds and make payments, so that they can pay the costs described above.
In other words, future services, whether or not they are based on blockchain, will need to support crypto-style private-key authentication and crypto-based payments.
Combining AI Agents with Chains
Beyond the two points above, how AI Agents can be combined with chains is a direction that everyone is exploring. At the event, I talked with @Mikkke_acc about focEliza, which he is working on. Of the two types of AI Agents mentioned above, at least the first needs a running or verification environment provided by a chain. Once an AI Agent provides services to the outside world, trust issues arise, and the role it plays is really the same as that of a smart contract.
The name "smart contract" was controversial in its day. It is just a piece of code, so how can it be "smart"? AI can make smart contracts truly worthy of the name. The problem is how to call an AI interface from the smart contract environment. While running a large model in a verifiable environment is still a long way off, an oracle-like solution is a more feasible path.
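To make the oracle-like path concrete, here is a minimal in-memory simulation of the request/fulfill pattern. It is only a sketch under my own assumptions: `MockContract` is a plain Python object standing in for a contract, `run_oracle_once` stands in for an off-chain service, and it does not address how to verify that the oracle actually ran the model honestly.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

ModelFn = Callable[[str], str]  # hypothetical off-chain AI interface

@dataclass
class Request:
    prompt: str
    result: Optional[str] = None

class MockContract:
    """In-memory stand-in for an on-chain contract; not real chain code."""

    def __init__(self, oracle_address: str) -> None:
        self.oracle_address = oracle_address
        self.requests: Dict[int, Request] = {}
        self._next_id = 0

    def request_inference(self, prompt: str) -> int:
        # The contract cannot call the model itself; it only records the request.
        request_id = self._next_id
        self._next_id += 1
        self.requests[request_id] = Request(prompt)
        return request_id

    def fulfill(self, sender: str, request_id: int, result: str) -> None:
        # Only the registered oracle may write results back.
        if sender != self.oracle_address:
            raise PermissionError("only the registered oracle may fulfill requests")
        req = self.requests[request_id]
        if req.result is not None:
            raise ValueError("request already fulfilled")
        req.result = result

def run_oracle_once(contract: MockContract, address: str, model: ModelFn) -> int:
    """Off-chain side: answer every pending request and return how many were answered."""
    fulfilled = 0
    for request_id, req in contract.requests.items():
        if req.result is None:
            contract.fulfill(address, request_id, model(req.prompt))
            fulfilled += 1
    return fulfilled
```

The contract logic stays deterministic: it stores a request and later reads a result, while the non-deterministic model call happens entirely off-chain.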
Many requirements will emerge around AI Agents. How does an AI Agent obtain public knowledge? How does an AI Agent judge facts? How does an AI Agent identify the same user across different platforms? How is "memory" stored in smart contracts? If I have multiple devices, each with an AI Agent installed, how do they share memory?
You will find that the things Web3 was already working on, such as "data on-chain", relationships on-chain, DID, and P2P networks, take on new meanings and new scenarios.
Conclusion
I'll reuse the conclusion I shared about AI and blockchain back in 2021: an Internet that is friendlier to AI is also an Internet that is friendlier to humans. It was just a fantasy then, but now the future is here.