Gemini 2.0 - A Model for "Everything"
Google unveiled Gemini 2.0, an experimental AI model heralded as a transformative step toward a "universal assistant."
Capable of autonomously navigating websites, the model aims to empower users to develop advanced AI agents.
CEO Sundar Pichai described it as Google's most capable creation yet, designed for the "agentic era."
This launch underscores Google's commitment to leading the AI race amidst fierce competition from industry giants like Meta and Microsoft.
The Model Will Be Rolled Out Across Products
Pichai announced that Gemini 2.0, featuring advanced multimodal capabilities, will soon be integrated into Google's suite of products, supporting native image and audio output.
This follows the December 2023 release of Gemini 1.0, touted as the first "natively multimodal" model capable of processing and responding to text, video, images, audio, and code queries.
The latest version reflects Google's drive to remain at the forefront of the competitive AI landscape.
Pichai noted:
“If Gemini 1.0 was about organizing and understanding information, Gemini 2.0 is about making it much more useful.”
Gemini 2.0, which debuts nearly 10 months after the intermediate 1.5 model, remains in experimental preview.
Currently, only the smaller, cost-effective 2.0 Flash variant is available, primarily to developers and testers.
Demis Hassabis, CEO of Google DeepMind, described the launch as a significant milestone for the company, despite its limited initial release.
Hassabis explained:
“It's as good as the current Pro model is. So you can think of it as one whole tier better, for the same cost efficiency and performance efficiency and speed. We're really happy with that.”
Other Gemini users still have access to 1.5 Flash, recognised for its speed and efficiency.
Beyond Gemini 2.0: Google Announces a Plethora of Features
Google has outlined ambitious plans for its latest AI model, Gemini 2.0, which Pichai says will enhance the AI Overviews feature already available to one billion users.
Pichai noted that AI Overviews is rapidly becoming one of Google's most popular search tools.
With the integration of Gemini 2.0, the feature will be capable of handling complex, multi-step queries, such as solving mathematical equations and addressing multimodal questions.
Limited testing for the model began this week, but broader access to its reasoning capabilities is slated for early next year.
The model operates on Google's 6th-generation AI chip, Trillium, which debuted alongside the announcement.
According to the company, Trillium offers four times the performance and is 67% more energy-efficient than its predecessor.
Google Cloud customers now have access to this cutting-edge hardware.
Among the new features powered by Gemini 2.0 is "Deep Research," an advanced research assistant available within Gemini Advanced.
This tool leverages reasoning and long-context capabilities to compile detailed research reports.
Hassabis remarked that these advancements set the stage for a transformative 2025:
“We really see 2025 as the true start of the agent-based era.”
Google also unveiled Project Mariner, an experimental Chrome extension capable of autonomously navigating web browsers, and introduced Jules, an AI agent designed to help developers identify and fix coding errors.
Another Gemini-powered feature, described as an "Easter egg" by Hassabis, is a gaming assistant capable of analysing a user's screen and improving gameplay—a testament to the model's true multimodal capabilities.