Google launched its deepest AI research agent yet — on the same day OpenAI dropped GPT-5.2

TechCrunch


Google released on Thursday a “reimagined” version of its research agent, Gemini Deep Research, based on its much-ballyhooed state-of-the-art foundation model, Gemini 3 Pro.

This new agent isn’t just designed to produce research reports, although it can still do that. It now allows developers to embed Google’s SOTA-model research capabilities into their own apps. That capability is made possible through Google’s new Interactions API, which is designed to give devs more control in the coming agentic AI era.

The new Gemini Deep Research tool is an agent equipped to synthesize mountains of information and handle a large context dump in the prompt. Google says it’s used by customers for tasks ranging from due diligence to drug toxicity safety research.

Google also says it will soon integrate this new deep research agent into services including Google Search, Google Finance, its Gemini app, and its popular NotebookLM. This is another step toward preparing for a world where humans don’t Google anything anymore; their AI agents do.

The tech giant says that Deep Research benefits from Gemini 3 Pro’s status as its “most factual” model, one trained to minimize hallucinations during complex tasks.

AI hallucinations, where the LLM just makes stuff up, are an especially crucial issue for long-running, deep reasoning agentic tasks, in which many autonomous decisions are made over minutes, hours, or longer. The more choices an LLM has to make, the greater the chance that even one hallucinated choice will invalidate the entire output.

To prove its progress claims, Google has also created yet another benchmark (as if the AI world needs another one). The new benchmark is unimaginatively named DeepSearchQA and is intended to test agents on complex, multi-step information-seeking tasks. Google has open sourced this benchmark.
Google also tested Deep Research on Humanity’s Last Exam, a much more interestingly named, independent benchmark of general knowledge filled with impossibly niche tasks, and on BrowseComp, a benchmark for browser-based agentic tasks.

As you might expect, Google’s new agent bested the competition on its own benchmark and on Humanity’s Last Exam. However, OpenAI’s ChatGPT 5 Pro was a surprisingly close second all the way around and slightly bested Google on BrowseComp.

But those benchmark comparisons were obsolete almost the moment Google published them, because on the same day, OpenAI launched its highly anticipated GPT-5.2, codenamed Garlic. OpenAI says its newest model bests its rivals, especially Google, on a suite of the typical benchmarks, including OpenAI’s homegrown one.

Perhaps one of the most interesting parts of this announcement was the timing. Knowing that the world was awaiting the release of Garlic, Google dropped some AI news of its own.

Julie Bort is the Startups/Venture Desk editor for TechCrunch.
You can contact or verify outreach from Julie by emailing julie.bort@techcrunch.com or via @Julie188 on X.


Source Information

Source: TechCrunch

Website: https://techcrunch.com/feed/