Gemini 3 Flash Outperforms Gemini 3 Professional and GPT 5.2 In These Key Benchmarks

Productivity

Gemini 3 Flash Outperforms Gemini 3 Professional and GPT 5.2 In These Key Benchmarks

12월 18, 2025

[ad_1]

The AI wars proceed to warmth up. Simply weeks after OpenAI declared a “code pink” in its race in opposition to Google, the latter launched its newest light-weight mannequin: Gemini 3 Flash. This explicit Flash is the most recent in Google’s Gemini 3 household, which began with Gemini 3 Professional, and Gemini 3 Deep Assume. However whereas this newest mannequin is supposed to be a lighter, inexpensive variant of the present Gemini 3 fashions, Gemini 3 Flash is definitely fairly highly effective in its personal proper. Actually, it beats out each Gemini 3 Professional and OpenAI’s GPT-5.2 fashions in some benchmarks.

Light-weight fashions are sometimes meant for extra fundamental queries, for lower-budget requests, or to be run on lower-powered {hardware}. Which means they’re typically quicker than extra highly effective fashions that take longer to course of, however can do extra. In line with Google, Gemini 3 Flash combines one of the best of each these worlds, producing a mannequin with Gemini 3’s “Professional-grade reasoning,” with “Flash-level latency, effectivity, and price.” Whereas that doubtless issues most to builders, basic customers also needs to discover the enhancements, as Gemini 3 Flash is now the default for each Gemini (the chatbot) and AI Mode, Google’s AI-powered search.

Gemini 3 Flash efficiency

You may see these enhancements in Google’s reported benchmarking stats for Gemini 3 Flash. In Humanity’s Final Examination, an instructional reasoning benchmark that exams LLMs on 2,500 questions throughout over 100 topics, Gemini 3 Flash scored 33.7% with no instruments, and 43.5% with search and code execution. Examine that to Gemini 3 Professional’s 37.5% and 45.8% scores, respectively, or OpenAI’s GPT-5.2’s scores of 34.5% and 45.5%. In MMMU-Professional, a benchmark that check a mannequin’s multimodal understanding and reasoning, Gemini 3 Flash bought the highest rating (81.2%), in comparison with Gemini 3 Professional (81%) and GPT-5.2 (79.5). Actually, throughout the 21 benchmarking exams Google highlights in its announcement, Gemini 3 Flash has the highest rating in three: MMMU-Professional (tied with Gemini 3 Professional), Toolathlon, and MMMLU. Gemini 3 Professional nonetheless takes the primary spot on essentially the most exams right here (14), and GPT-5.2 topped eight exams, however Gemini 3 Flash is holding its personal.

Google notes that Gemini 3 Flash additionally outperforms each Gemini 3 Professional and your entire 2.5 collection within the SWE-bench Verified benchmark, which exams the mannequin’s coding agent capabilities. Gemini 3 Flash scored a 78%, whereas Gemini 3 Professional scored 76.2%, Gemini 2.5 Flash scored 60.4%, and Gemini 2.5 Professional scored 59.6%. (Notice that GPT-5.2 scored one of the best of the fashions Google mentions on this announcement.) It is a shut race, particularly when you think about this can be a light-weight mannequin scoring alongside these firm’s flagship fashions.

Gemini 3 Flash value

Which may current an fascinating dilemma for builders who pay to make use of AI fashions of their packages. Gemini 3 Flash prices $0.50 per each million enter tokens (what you ask the mannequin to do), and $3.00 per each million output tokens (the consequence the fashions returns out of your immediate). Examine that to Gemini 3 Professional, which prices $2.00 per each million enter tokens, and $12.00 per each million output tokens, or GPT-5.2’s $3.00 and $15.00 prices, respectively. For what it is value, it is not as low-cost as Gemini 2.5 Flash ($0.30 and $2.50), or Grok 4.1 Quick for that matter ($0.20 and $0.50), but it surely does outperform these fashions in Google’s reported benchmarks. Google notes that Gemini 3 Flash makes use of 30% fewer tokens on common than 2.5 Professional, which is able to save on value, whereas additionally being 3 times quicker.

When you’re somebody who wants LLMs like Gemini 3 Flash to energy your merchandise, however you do not wish to pay the upper prices related to extra highly effective fashions, I might picture this newest light-weight mannequin trying interesting from a monetary perspective.

How the typical person will expertise Gemini 3 Flash

Most of us utilizing AI aren’t doing in order builders who want to fret about API pricing. The vast majority of Gemini customers are doubtless experiencing the mannequin by way of Google’s client merchandise, like Search, Workspace, and the Gemini app.

What do you assume up to now?

Beginning at the moment, Gemini 3 Flash is the default mannequin within the Gemini app. Google says it may possibly deal with many duties “in just some seconds.” Which may embrace asking Gemini for tips about bettering your golf swing primarily based on a video of your self, or importing a speech on a given historic matter and requesting any information you might need missed. You can additionally ask the bot to code you a functioning app from a collection of ideas.

You will additionally expertise Gemini 3 Flash in Google Search’s AI Mode. Google says the brand new mannequin is healthier at “parsing the nuances of your query,” and thinks by way of every a part of your request. AI Mode tries to return a extra full search consequence by scanning a whole bunch of web sites directly, and placing collectively a abstract with sources on your reply. We’ll must see if Gemini 3 Flash improves on earlier iterations of AI Mode.

I am somebody who nonetheless does not discover a lot use for generative AI merchandise of their day-to-day lives, and I am not fully positive Gemini 3 Flash goes to alter that for me. Nevertheless, the stability of efficiency good points with the price to course of that energy is fascinating, and I am significantly intrigued to see how OpenAI responds.

Gemini 3 Flash is on the market to all customers beginning at the moment. Along with basic customers in Gemini and AI Mode, builders will discover it within the Gemini API in Google AI Studio, Gemini CLI, and Google Antigravity, the corporate’s new agentic growth platform. Enterprise customers can use it in Vertex AI and Gemini Enterprise.

[ad_2]

Gemini 3 Flash efficiency

Gemini 3 Flash value

How the typical person will expertise Gemini 3 Flash

LEAVE A REPLY Cancel reply