Gemma 4
Google's open model family — released 2 April 2026 under a fully permissive Apache 2.0 licence, with versions that run on everything from a phone to a workstation.
Gemma 4 specs at a glance
| Spec | Detail |
|---|---|
| Maker | Google |
| Released | 2 April 2026 |
| Licence | Apache 2.0 — fully permissive, open weights |
| Base architecture | Built on Gemini 3 |
| Sizes | E2B (2B) · E4B (4B) · 26B MoE · 31B dense |
| Modalities | Text + images (all sizes); audio (E2B, E4B) |
| Languages | 140+ |
| Best at | Local / self-hosted deployment, edge & mobile inference |
The four Gemma 4 model sizes
Gemma 4 is a family, not a single model. Each size targets a different deployment footprint, so you pick the model that fits your hardware rather than the other way round.
| Variant | Parameters | Runs on |
|---|---|---|
| E2B | 2B | Phones (text, image, audio) |
| E4B | 4B | Edge devices (text, image, audio) |
| 26B MoE | 26B total · ~4B active | Consumer GPUs |
| 31B dense | 31B | Workstations |
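A rough weight-memory estimate shows why each size maps onto that hardware tier. The sketch below uses the standard bits-per-parameter rule of thumb and ignores KV cache, activations and runtime overhead; the figures are illustrative estimates, not official requirements.

```python
# Back-of-envelope VRAM needed just to hold the weights of each
# Gemma 4 variant, at fp16 and at 4-bit quantisation. Ignores KV
# cache and runtime overhead; estimates only, not official specs.
GB = 1e9

def weight_footprint_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate weight memory in GB at a given precision."""
    return params_billions * 1e9 * bits_per_param / 8 / GB

for name, params in [("E2B", 2), ("E4B", 4), ("26B MoE", 26), ("31B dense", 31)]:
    fp16 = weight_footprint_gb(params, 16)
    int4 = weight_footprint_gb(params, 4)
    print(f"{name:>9}: ~{fp16:.0f} GB at fp16, ~{int4:.0f} GB at 4-bit")
```

At 4-bit, the 26B MoE's weights come to roughly 13 GB, which is why a single consumer GPU is plausible; note the MoE still needs all 26B parameters resident even though only ~4B are active per token.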
Key takeaway
The 26B Mixture-of-Experts variant delivers about 97% of the 31B dense model's performance while activating only ~4B parameters per inference — making frontier-level quality practical on a single consumer GPU.
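The arithmetic behind that claim is simple: per-token compute in a transformer scales roughly with the number of parameters actually activated, so the MoE's advantage can be estimated from the active-to-total ratio. A minimal sketch, using only the figures quoted above:

```python
# Estimate the 26B MoE's per-token cost from its active parameter
# count. Simplified model: assumes compute scales linearly with
# active parameters and ignores expert-routing overhead.
total_params = 26e9    # all experts combined
active_params = 4e9    # parameters used per token
dense_params = 31e9    # the dense sibling, for comparison

active_fraction = active_params / total_params   # share of MoE weights used per token
compute_vs_dense = active_params / dense_params  # cost relative to the 31B dense model

print(f"Active per token: {active_fraction:.0%} of the MoE's weights")
print(f"Per-token compute vs 31B dense: ~{compute_vs_dense:.0%}")
```

So the MoE does roughly an eighth of the dense model's per-token work while, per the figures above, keeping ~97% of its benchmark performance.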
The Apache 2.0 licence
The licence change is the headline. Gemma 4 is the first Google open model released under Apache 2.0 — a fully permissive licence with no monthly-active-user caps, no acceptable-use policy enforced by the model creator, and complete freedom to modify, redistribute and commercialise. Earlier Gemma releases carried Google's custom terms; Apache 2.0 removes the legal friction that kept some companies away, which makes Gemma 4 a genuine watershed for the open model ecosystem.
Gemma 4 benchmarks
Gemma 4 punches well above its weight: the 31B dense model competes with systems many times its size.
| Benchmark | Gemma 4 (31B dense) | What it measures |
|---|---|---|
| AIME 2026 | 89.2% | Competition mathematics |
| LiveCodeBench | 80.0% | Real-world coding tasks |
| MMLU Pro | 85.2% | Broad expert-level knowledge |
| Arena AI | #3 ranking | Human-preference leaderboard |
Architecture
Gemma 4 is built on the Gemini 3 architecture and uses a hybrid attention mechanism that alternates between local sliding-window attention (512-1024 tokens) and global full-context attention, plus per-layer embeddings for deeper representation. The 26B variant adds a Mixture-of-Experts optimisation, activating only ~4B of its parameters per token — the trick that lets a large model run cheaply.
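The local/global split can be illustrated with a toy visibility check: a local layer lets each token attend only to a recent causal window, while a global layer sees the whole prefix. The specific window value and the layer pattern comment below are illustrative assumptions; only the 512-1024 range comes from the description above.

```python
def sliding_window_allows(q: int, k: int, window: int) -> bool:
    """Local layer: query position q may attend to key position k only
    if k is causal (k <= q) and within the last `window` positions."""
    return k <= q and q - k < window

def global_allows(q: int, k: int) -> bool:
    """Global layer: plain causal attention over the full context."""
    return k <= q

# Illustrative window from the 512-1024 range quoted in the text;
# the actual per-layer pattern is not specified here.
WINDOW = 512

print(sliding_window_allows(1000, 900, WINDOW))  # True: 100 tokens back, inside window
print(sliding_window_allows(1000, 100, WINDOW))  # False: 900 tokens back, outside window
print(global_allows(1000, 100))                  # True: global layers see everything
```

Alternating the two keeps attention cost near-linear in sequence length on most layers while the occasional global layer preserves long-range information flow.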
Who should use Gemma 4
- Developers who want to self-host — open Apache 2.0 weights, no API fees, full control.
- Edge and mobile builders — E2B and E4B run on phones and edge hardware, with audio support.
- Cost-conscious teams with a single GPU — the 26B MoE variant brings near-frontier quality to consumer hardware.
- Multilingual products — 140+ language coverage is among the broadest of any open model.
If you need the absolute frontier of capability or native video reasoning, Google's hosted Gemini 3.1 Ultra is the stronger choice — Gemma 4 trades peak performance for portability and licence freedom.
How Gemma 4 compares
- Gemini 3.1 Ultra — Google's hosted flagship
- DeepSeek V4 — the other major open-weights family
- GPT-5.5 ("Spud")
Frequently asked questions
When was Gemma 4 released?
Google released Gemma 4 on 2 April 2026. It is Google's first fully permissive open model, shipped under the Apache 2.0 licence.
What sizes does Gemma 4 come in?
Gemma 4 comes in four sizes: E2B (2B parameters, for phones), E4B (4B, for edge devices), a 26B Mixture-of-Experts variant (about 4B active, for consumer GPUs), and a 31B dense model for workstations.
Is Gemma 4 free for commercial use?
Yes. Gemma 4 is released under the Apache 2.0 licence with no monthly active user caps, no acceptable-use restrictions, and full freedom to modify, redistribute and commercialise.
How good is Gemma 4?
The 31B dense Gemma 4 model scores 89.2% on AIME 2026, 80.0% on LiveCodeBench and 85.2% on MMLU Pro, and ranks #3 on Arena AI — competitive with models many times its size.
Can Gemma 4 run on a single GPU?
Yes. Gemma 4's 26B Mixture-of-Experts variant activates only about 4B parameters per inference and fits on a single consumer GPU, while the smaller E2B and E4B models run on phones and edge devices.
How many languages does Gemma 4 support?
Gemma 4 supports over 140 languages, making it one of the most multilingual open models available.