Gemini 3.1 Ultra

Google's flagship multimodal model, pairing a 2-million-token context window with native reasoning over video, audio and text in one pass.

Gemini 3.1 Ultra is Google's flagship AI model, released in April 2026. Its defining features are a 2 million token context window — the largest of any publicly available model — and native multimodal processing: it handles video, audio and text together with no transcription step in between.

Gemini 3.1 Ultra specs at a glance

Maker: Google
Released: April 2026
Context window: 2,000,000 tokens — largest publicly available, 10x the prior generation
Modalities: Video, audio and text processed natively in one pass
Standout feature: Native multimodal reasoning + sandboxed Python execution
Tiers: Gemini 3.1 Ultra · Gemini 3.1 Pro
API pricing (Pro): ~$2 / $12 per 1M input / output tokens
Consumer plan: $19.99/month — includes 3.1 Pro & 20 Deep Research reports/day
Best at: Multimodal understanding, very long documents, media analysis

What makes Gemini 3.1 different

The headline with Gemini 3.1 Ultra is how it handles multimodal input. Most models convert audio or video into text first, then reason over that transcript. Gemini 3.1 Ultra processes video, audio and text simultaneously — no transcription middle-step — which Google describes as a first for a mainstream commercial model.

In practice that means less information is lost between formats: tone, timing, pacing and visual context survive into the model's reasoning instead of being flattened into plain text. For captioning, media analysis, meeting summarisation or anything where how something was said matters as much as what was said, that is a structural advantage.

Gemini 3.1 also adds native sandboxed code execution. It can run Python in a secure sandbox without a third-party plugin — writing code, executing it, observing the output and revising — which makes it far more reliable for data analysis and computational tasks.
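Google has not published the internals of its sandbox, but the write-execute-observe-revise loop it describes can be illustrated locally. The sketch below, an assumption-laden stand-in rather than Google's actual implementation, runs a snippet in a separate Python interpreter process and captures the result so a caller can inspect the failure and retry (note that a plain subprocess is not a real security sandbox):

```python
import subprocess
import sys

def run_snippet(code: str, timeout: float = 5.0) -> tuple[str, str, int]:
    """Run a Python snippet in a separate interpreter process and capture
    its output, loosely mimicking one execute-and-observe step.
    (Illustrative only: a subprocess is NOT a hardened sandbox.)"""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout, result.stderr, result.returncode

# First attempt: code with a bug (division by zero).
out, err, rc = run_snippet("print(10 / 0)")
assert rc != 0 and "ZeroDivisionError" in err   # observe the failure

# Revised attempt after observing the error.
out, err, rc = run_snippet("print(10 / 2)")
assert rc == 0 and out.strip() == "5.0"
```

The point of the loop is that the observed stderr feeds the revision step, which is what makes the model's computational answers self-correcting rather than one-shot.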

The 2 million token context window

Gemini 3.1's 2M-token context window is the single most consequential design decision in the model — a 10x jump over the previous generation and the largest window available in any public model in 2026.

Why it matters

A 2M-token window can hold roughly 1.5 million words at once — entire codebases, long legal contracts, or hundreds of pages of research — without chunking or retrieval workarounds. That makes Gemini 3.1 the default pick for whole-document reasoning.
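The ~1.5 million word figure follows from the common rule of thumb of roughly 0.75 English words per token (an assumption; the exact ratio varies by tokenizer and text):

```python
# Back-of-envelope capacity of a 2M-token window.
CONTEXT_TOKENS = 2_000_000
WORDS_PER_TOKEN = 0.75        # rough rule of thumb, not an exact ratio

words = int(CONTEXT_TOKENS * WORDS_PER_TOKEN)
print(words)                  # 1500000 — the ~1.5M words cited above

# Expressed as 500-word pages:
pages = words // 500
print(pages)                  # 3000 pages in a single prompt
```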

Pricing and access

Gemini 3.1 is one of the cheaper frontier models to run. API pricing for the Pro tier sits around $2 per million input tokens and $12 per million output tokens — roughly 2.5x cheaper than GPT-5.5 on the same workload. Google had not published final Ultra-tier API pricing at launch.
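At the quoted Pro rates, per-request cost is simple arithmetic. A minimal cost helper, using only the figures above (the example workload sizes are illustrative):

```python
# Quoted Gemini 3.1 Pro API rates, in USD per 1M tokens.
INPUT_PER_M = 2.00
OUTPUT_PER_M = 12.00

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the quoted Pro-tier rates."""
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# e.g. summarising a long contract: 500k tokens in, 5k tokens out.
print(round(cost_usd(500_000, 5_000), 2))   # 1.06
```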

On the consumer side, $19.99/month unlocks Gemini 3.1 Pro plus 20 Deep Research reports per day and generous cloud storage — competitive with ChatGPT Plus and Claude Pro at the same price point. Developers reach the model through the Gemini API and Google AI Studio.

Who should use Gemini 3.1 Ultra

Gemini 3.1 Ultra is the natural pick for multimodal understanding, very long documents and media analysis: anywhere the 2M-token window or native video and audio processing pays off. For pure agentic coding, GPT-5.5 still holds a measurable benchmark lead; see the comparison below.

The wider Gemini line-up

Alongside the Ultra and Pro tiers, Google also ships Gemma 4 — an open-source model released 2 April 2026 under an Apache 2.0 licence, aimed at developers who want frontier-level intelligence they can run on their own infrastructure. Where Gemini 3.1 is the hosted flagship, Gemma 4 is the self-hostable companion.

How Gemini 3.1 compares

Against GPT-5.5, Gemini 3.1 leads on context window (2M vs 1M tokens), native multimodal processing and price, while GPT-5.5 keeps the edge on coding benchmarks, scoring 58.6% to Gemini's 54.2% on SWE-bench Pro.


Frequently asked questions

When was Gemini 3.1 Ultra released?

Google released Gemini 3.1 Ultra in April 2026, alongside the Gemini 3.1 Pro tier.

How big is Gemini 3.1's context window?

Gemini 3.1 has a 2 million token context window — the largest of any publicly available model in 2026 and a 10x increase over the previous generation.

What makes Gemini 3.1 Ultra different?

Gemini 3.1 Ultra processes video, audio and text together natively, with no transcription middle-step, so tone, timing and visual context survive into its reasoning. It can also run Python code in a sandboxed environment natively.

How much does Gemini 3.1 cost?

Gemini 3.1 Pro is priced at roughly $2 per million input tokens and $12 per million output tokens. Consumer access is $19.99/month, which includes Gemini 3.1 Pro and 20 Deep Research reports per day.

Is Gemini 3.1 better than GPT-5.5?

Gemini 3.1 leads on context window (2M vs 1M tokens), native multimodal processing and price. GPT-5.5 leads on coding benchmarks — 58.6% vs 54.2% on SWE-bench Pro.

Can Gemini 3.1 run code?

Yes. Gemini 3.1 can run Python in a sandboxed environment natively — it writes code, executes it, observes the output and revises, without a third-party code-interpreter plugin.