Author: Wuji, Special Translator for Tencent Technology
On November 19th, Beijing time, following Google's release of the Gemini 3 series models, The New York Times' technology podcast, Hard Fork, released a special program featuring interviews by hosts Kevin Roose and Casey Newton with Google DeepMind CEO Demis Hassabis and Google Gemini team leader Josh Woodward.
This interview focuses on Google's newly released flagship AI model, Gemini 3 (specifically the Pro version of the Gemini 3.0 series). The industry widely regards this as a milestone release: Google's first major step toward regaining technological and product leadership after the failure of Bard and the subsequent catch-up phases of Gemini 1.x and 2.x. The two executives elaborated on Gemini 3's breakthroughs in multi-step reasoning, code generation (especially front-end and "vibe coding"), and dynamic generation of interactive interfaces, emphasizing that Google has rapidly deployed its most powerful model to products with billions of users, such as Search, Gmail, and Workspace, reshaping its competitive advantage. Key takeaways from the interview: Gemini 3 is fully in line with the expected development trajectory; reaching artificial general intelligence (AGI) will still take 5 to 10 years and 1 to 2 major research breakthroughs; Google's full-stack advantages in efficiency, cost, and distribution position it to succeed in any market environment; and while an AI bubble exists in places, Google has the dual guarantee of short-term monetization and long-term trillion-dollar new markets.
The following is a condensed version of the interview:
Roose: Casey, we're airing a special episode today, focusing on the release of Gemini 3.
Newton: Yes, Kevin. This model has been highly anticipated in Silicon Valley's AI community, and we're finally getting to experience the real product firsthand.
Roose: There are two main reasons why we broke our usual Friday release schedule to record this special episode. First, we had the opportunity to interview two key figures in Google's AI team (DeepMind CEO Hassabis and Gemini team VP Woodward). Second, the release of Gemini 3 has generated significant industry attention.
We've heard from multiple labs that this model has achieved breakthroughs in certain key areas, potentially posing a substantial threat to competitors. For the past two years, Google has been seen as a follower; the question now is: have they regained the lead?

Newton: Before we delve into the interview, let's briefly review what we know. Google held a closed-door briefing before the launch, and the most notable new capabilities of Gemini 3 include significantly enhanced coding and "vibe coding" capabilities, and a completely new interactive interface generation function. It no longer just outputs text but directly generates customized interactive interfaces for users. For example, if a user asks about Van Gogh's life, the model will instantly generate a complete learning page with images, a timeline, and interactive elements; another example is generating a mortgage calculator for properties worth millions of dollars. These features mark a leap from "answering questions" to "building experiences."

Roose: In all publicly available benchmark tests, Gemini 3 significantly outperformed Gemini 2.5 Pro. For example, on "Humanity's Last Exam," the interdisciplinary doctoral-level problem set, Gemini 2.5 Pro scored only 21.6%, while Gemini 3 jumped to 37.5%. Google's overall message is: any task you can complete on ChatGPT, Claude, or older versions of Gemini, you can do better on Gemini 3.

Newton: They also showcased an early demo of Gemini Agent: the model can deeply integrate with the user's email inbox, understand all email content, automatically categorize and draft replies, and even help users completely empty their inbox. Furthermore, starting this week, Gemini 3 will be available in the Gemini app and Google Search's AI Mode, and US college students will receive a free year of premium access. Google repeatedly emphasizes the keyword "Learn Anything," effectively positioning Gemini as the ultimate personalized education tool.
Roose: Demis, Josh, welcome to Hard Fork. Two years ago, Sundar Pichai compared Bard to "a modified Honda Civic" racing against much stronger competitors. So, what kind of car is Gemini 3?

Hassabis: I hope it's much faster than a Honda Civic. I'm not sure the car comparison fits; perhaps it's more like a professional drag racer. It's not designed for everyday driving or a winding track; it has pure, concentrated power for a specific goal. It represents the combination of our cutting-edge research and scalable computing power, built to deliver unmatched bursts of speed in this race at the frontier of intelligence.

Roose: That's interesting. Compared to all previous AI models, what exactly can Gemini 3 do at a concrete level? Please give us some quantifiable, practical examples.

Woodward: Three points stand out. First, multi-step reasoning: it can hold more steps in mind at once, lifting reliability to a whole new level. Previous models often "lost their train of thought" or hallucinated by the fifth or sixth step of a complex logical deduction, while Gemini 3 can reliably complete 10 to 15 coherent reasoning steps, such as complex tax planning, end-to-end planning and booking for international travel, or debugging a massive system with millions of lines of code. Second, it will for the first time generate entirely new interactive interfaces at scale. Users no longer get plain text answers but customized software components. For example, if you ask it to "design a dashboard that tracks all my portfolios," it generates an interactive, functional dashboard in real time instead of paragraphs describing how to build one. Third, we've invested heavily in coding capabilities, especially front-end and "vibe coding," meaning it can generate fully functional, well-designed user interface code from natural language prompts.
Upcoming products like Google Antigravity will fully demonstrate this, with the model dynamically changing the layout and functionality of the user interface based on context.

Newton: Many people believe the "chat" use case is essentially solved for the average user. They can't even think of a new question that would let Gemini 3 give a significantly different answer from its predecessor. What's your take on that view?

Woodward: I understand that. On the surface, the accuracy of basic question answering is already quite high. But the real difference lies in reliability, integration, and information presentation. Gemini 3's answers will be more concise, expressive, and easier to understand, changes most people will notice immediately. More importantly, the model is beginning to integrate deeply with other user data sources, such as other products in the Google ecosystem, truly moving beyond simple question-and-answer to become the user's "digital steward." It understands the context of your entire inbox, so when drafting replies it not only answers the question but also adjusts tone and content based on your past style and your relationship with the recipient.

Hassabis: I completely agree. Its reliability, style, and personality have been carefully refined, making it more concise and to the point. In scenarios like "vibe coding," it has already crossed the practicality threshold. This is a shift from "smart assistant" to "smart colleague." I personally plan to use it to get back into game programming over the Christmas holidays. It can now not only write functional code but also offer architectural suggestions early in the design process.

Roose: Demis, in an interview with us this May, you predicted that AGI was still 5 to 10 years away and might require several major breakthroughs. Has Gemini 3 changed that timeline?

Hassabis: Not at all.
It perfectly aligns with the trajectory we've set over the past two years. In fact, since the launch of the Gemini series, our pace of progress has been the fastest in the industry. Gemini 3 is amazing, but still within expectations. To reach true artificial general intelligence, we still need one or two key breakthroughs in consistency, reasoning depth, memory mechanisms, and physical world modeling (such as the SIMA and Genie projects we are currently working on). Today's models operate mostly in a "System 1" mode (fast, intuitive); to achieve AGI, we must unlock "System 2" thinking (slow, deliberate, analytical). The model also needs a long-term, selective memory mechanism, capable of recalling and applying specific interactions from weeks or months ago rather than being limited to a narrow context window. So our 5- to 10-year forecast remains unchanged.

Newton: On model personality and user relationships: the industry is buzzing about "AI companions." What kind of relationship do you hope users will build with Gemini 3?

Woodward: This is a sensitive but important question. We position it as a "super tool" rather than an emotional companion; its core value is helping users complete daily tasks efficiently and improve productivity. Internally, we're focused on a new metric: how many tasks did we help you complete today? That's closer to the core value of the original Google Search: efficiency. We believe positioning the model as an emotional partner both poses safety risks and deviates from Google's core mission as a provider of information and tools.

Roose: Was abandoning the viral growth opportunity of "sexual partners" a major strategic mistake?

Woodward: No comment. Our safety team has strict guidelines and principles in place.

Roose: Competitors have been noticeably nervous in the past few weeks. Do you think Google is currently leading the AI race?
Hassabis: The current environment is the most competitive in history. The only thing that really matters is the speed of progress, and we are very happy with ours. We never lost our research leadership; we're now just catching up in product deployment. Our competitors are excellent at research, but they cannot replicate our advantages in large-scale distribution and vertical integration. We are integrating Gemini into products with billions of users, including Maps, YouTube, Android, Search, and Workspace. This distribution network and end-user data feedback loop is a formidable moat. Furthermore, our full-stack advantage built on custom TPU chips gives us training costs and efficiency far better than competitors relying on external GPU resources.

Newton: What are your thoughts on the debate over scaling laws and diminishing returns? Some argue that the larger the model, the smaller the marginal gains in performance.

Hassabis: This is an ongoing debate. We are very satisfied with the improvements in Gemini 3 over 2.5; they are entirely in line with our expectations. The returns are not as exponentially explosive as in the early days, but the incremental gains in practicality and reliability still far outweigh our marginal costs, making heavy investment worthwhile. Until the one or two research breakthroughs required for AGI arrive, continuing to drive performance through the largest possible foundation models remains the most effective strategy. We believe scaling laws still hold.

Roose: Are we in an AI bubble?

Hassabis: That's too binary a question. There are indeed bubbles in certain areas (for example, companies with billions in seed funding but no actual products, only concepts), where valuations are disproportionate to actual revenue.
But Google simultaneously has short-term monetization (Search, Workspace, cloud TPUs) and long-term trillion-dollar new markets (robotics, gaming, drug discovery, materials science, and more). For example, our specialized models like AlphaFold are creating real value in drug discovery, a trillion-dollar market unrelated to consumer AI valuations. Whether or not a short-term bubble exists, we will prevail: seizing opportunities during booms and staying more resilient during contractions thanks to our full-stack advantages and strong cash flow.

Newton: If it were Thanksgiving and someone wanted to steer the conversation away from politics, what Gemini 3 features would you suggest to wow the crowd?

Woodward: I don't know if it can save Thanksgiving, but it'll bring laughter. Take a selfie, then let Gemini 3 edit it like crazy. Gemini's image processing capabilities are still the best in the world. You can instantly transform a family photo into any hilarious scene, style, or era; it'll definitely get a laugh from everyone. Then, when you show them how it can help write a proper resignation letter or generate a customized holiday recipe calculator, they'll naturally explore other new features.