Author: Stephen Katte, Cointelegraph; Translator: Tao Zhu, Golden Finance
Google's artificial intelligence research lab DeepMind says its newly released Gemini 2.0 model will serve as the foundation for building more advanced AI agents.
In a December 11 blog post, Google DeepMind CEO Demis Hassabis and CTO Koray Kavukcuoglu said AI agents powered by the newly released Gemini 2.0 can understand complex instructions, plan, reason, take actions across websites, and even help develop video game strategies.
Hassabis and Kavukcuoglu said: "The practical application of AI agents is a research area full of exciting possibilities. We are exploring this new field with a series of prototypes that can help people complete tasks and get things done."
According to Hassabis and Kavukcuoglu, there are currently multiple experimental AI assistant projects powered by Gemini, each with different capabilities.
One project, called Deep Research, helps users explore complex topics: it creates a multi-step research plan, searches the web, and then generates a long-form report on its findings.
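Deep Research's implementation has not been published; purely as an illustration, the flow described above — draft a multi-step plan, gather material for each step, then assemble a report — can be sketched as a toy pipeline. The function names and placeholder "search" results here are hypothetical, not part of Google's system.

```python
# Toy sketch of a plan -> search -> report pipeline (hypothetical; Deep
# Research's real internals are not public).

def make_plan(topic: str) -> list[str]:
    # In the real system, a model would draft these research steps.
    return [
        f"Define {topic}",
        f"Survey recent work on {topic}",
        f"Summarize open questions about {topic}",
    ]

def search(step: str) -> str:
    # Stand-in for a web search; returns a placeholder finding.
    return f"Findings for: {step}"

def deep_research(topic: str) -> str:
    plan = make_plan(topic)                      # 1. multi-step plan
    findings = [search(step) for step in plan]   # 2. research each step
    # 3. assemble the findings into a single report
    return f"Report on {topic}\n" + "\n".join(f"- {f}" for f in findings)

print(deep_research("AI agents"))
```

The key structural point is that planning, retrieval, and synthesis are separate stages, so each can be iterated on independently.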
Project Astra is a general-purpose AI assistant focused on everyday tasks, such as responding to user prompts with suggestions on how to do laundry or more information about a landmark.
Project Mariner focuses on creating an AI agent that can control your Chrome browser, move the cursor, click buttons, fill out forms, and navigate websites.
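Mariner's internals are likewise not public; as an illustration only, the observe-decide-act loop of a browser-controlling agent can be simulated with a toy in-memory "page" in place of a real browser. Every class and method name below is hypothetical.

```python
# Hypothetical sketch of a browser-controlling agent loop; Mariner's actual
# design is not documented publicly. A toy Page object stands in for Chrome.
from dataclasses import dataclass, field

@dataclass
class Page:
    url: str
    forms: dict = field(default_factory=dict)   # form field name -> value
    buttons: list = field(default_factory=list)

class BrowserAgent:
    """Executes a plan of browser actions (e.g. one a model proposed)."""

    def __init__(self, page: Page):
        self.page = page
        self.log = []  # record of every action taken

    def fill(self, name: str, value: str) -> None:
        self.page.forms[name] = value
        self.log.append(f"fill {name}")

    def click(self, button: str) -> None:
        if button in self.page.buttons:
            self.log.append(f"click {button}")

    def navigate(self, url: str) -> None:
        self.page = Page(url=url)
        self.log.append(f"navigate {url}")

    def run(self, plan: list[tuple]) -> list[str]:
        # Each step is (action_name, *args); dispatch to the matching method.
        for action, *args in plan:
            getattr(self, action)(*args)
        return self.log

agent = BrowserAgent(Page(url="https://example.com", buttons=["Search"]))
steps = [("fill", "q", "flight to Tokyo"),
         ("click", "Search"),
         ("navigate", "https://example.com/results")]
print(agent.run(steps))  # prints the action log
```

In a real agent, the plan would be generated step by step from screenshots or DOM snapshots rather than scripted up front, which is one reason such agents are currently slow.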
According to Hassabis and Kavukcuoglu, these projects are "still in the early stages of development," but they hope to make them "broadly available in future products" after testing and further development.
"It's early days, but Mariner points toward AI agents that can search, browse, and take other actions on a user's behalf. The project demonstrates that navigation within a browser is already technically possible, although it is not always accurate and is slow to complete tasks; this will improve rapidly over time," they said.
Meanwhile, Project Jules is being developed as a developer assistant that integrates directly into GitHub workflows to help with tasks such as coding and planning.
Hassabis and Kavukcuoglu said they have also used Gemini 2.0 to build agents for video games that can advise players on their next move in real-time conversations and search for “rich gaming knowledge” online.
“We are working with leading game developers such as Supercell to explore how these agents work, testing their ability to interpret the rules and challenges of a variety of games, from strategy games to farming simulators,” they said.
In November, Marc Benioff, CEO of US cloud computing software company Salesforce, said that the future of AI lies in autonomous agents, not large language models (LLMs).
“I actually think we’ve reached the ceiling of the LLM now,” he said on November 23.
Nvidia is also positioning itself at the forefront of the trend. “We’re seeing the number of AI-native companies continue to grow. And certainly we’re starting to see enterprise adoption of agentic AI really as the latest wave,” Nvidia CEO Jensen Huang said during the company’s third-quarter earnings call in November.
In addition, Hassabis and Kavukcuoglu said the team is “experimenting with agents that can help in the physical world” through robotics. For now, Google’s AI agents are only being released to testers and developers.