GPT-5.5 ("Spud")
OpenAI's flagship model, released 23 April 2026 — its biggest step yet for agentic coding, computer use and deep research.
GPT-5.5 specs at a glance
| Spec | Detail |
|---|---|
| Maker | OpenAI |
| Codename | Spud |
| Released | 23 April 2026 |
| Model type | Unified multimodal — first full retrain since GPT-4.5 |
| Modalities | Text, images, audio, video (end-to-end in one architecture) |
| Context window | 1,000,000 tokens (ChatGPT & API) · 400,000 tokens (Codex CLI) |
| Variants | GPT-5.5 Thinking · GPT-5.5 Pro |
| API pricing | $5 / $30 per 1M input / output tokens (Thinking) |
| Best at | Agentic coding, computer use, deep research |
| Access | ChatGPT & Codex — Plus, Pro, Business, Enterprise |
What's new in GPT-5.5
GPT-5.5 is OpenAI's headline release for the first half of 2026. The focus is squarely on agentic work — the model is markedly better at coding, at operating a computer through a sequence of steps, and at carrying out longer research tasks without losing the thread.
What sets this release apart from a normal point upgrade is the architecture. GPT-5.5 is the first base model OpenAI has fully retrained since GPT-4.5: a rewritten architecture, a fresh pretraining corpus, and a training objective oriented around agentic behaviour rather than single-turn chat. OpenAI's Greg Brockman described it as "a new class of intelligence" and "a big step toward more agentic and intuitive computing."
It also unifies modalities. GPT-5.5 processes text, images, audio and video inside one architecture, handling every modality end-to-end rather than routing each one through a separate sub-model. The pitch is reliability across multi-step jobs: writing and debugging code, navigating tools, and producing knowledge work that holds together from start to finish.
GPT-5.5 benchmarks
GPT-5.5 posts the strongest agentic-coding scores OpenAI has shipped to date. The numbers below are from OpenAI's release notes and independent benchmark trackers.
| Benchmark | GPT-5.5 | What it measures |
|---|---|---|
| Terminal-Bench 2.0 | 82.7% | Command-line / agentic task completion |
| GDPval | 84.9% | Economically valuable knowledge work |
| Expert-SWE | 73.1% | Expert software-engineering tasks (up from 68.5% on GPT-5.4) |
| SWE-bench Pro | 58.6% | Real-world multi-file engineering |
| FrontierMath Tier 4 | 35.4% | Hardest research-level maths (39.6% on GPT-5.5 Pro) |
Key takeaway
GPT-5.5's gains are concentrated in agentic coding and computer use. Its 82.7% on Terminal-Bench 2.0 is 14.2 points ahead of Gemini 3.1's 68.5%, and Expert-SWE rose 4.6 points over GPT-5.4 (68.5% to 73.1%) in a single release cycle.
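The gains quoted in this section are plain differences in percentage points. A minimal sketch of that arithmetic, using only scores from the benchmark table; the helper name `point_gain` is illustrative and not part of any OpenAI tooling:

```python
def point_gain(new_score: float, old_score: float) -> float:
    """Return the change between two benchmark scores in percentage points."""
    # Round to one decimal to match the precision of the published scores.
    return round(new_score - old_score, 1)


# Expert-SWE: GPT-5.5 vs GPT-5.4 (figures from the table above).
expert_swe_gain = point_gain(73.1, 68.5)

# FrontierMath Tier 4: GPT-5.5 Pro vs GPT-5.5 (figures from the table above).
frontiermath_pro_edge = point_gain(39.6, 35.4)

print(f"Expert-SWE gain: {expert_swe_gain} points")
print(f"FrontierMath Tier 4, Pro edge: {frontiermath_pro_edge} points")
```

Percentage points are not the same as relative improvement: a 4.6-point rise from 68.5% is a relative gain of about 6.7%.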
Pricing and variants
GPT-5.5 ships in two API variants. Pricing roughly doubled versus GPT-5.4 — a notable move in a market where most competitors are cutting prices.
| Variant | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GPT-5.5 (Thinking) | $5.00 | $30.00 |
| GPT-5.5 Pro | $30.00 | $180.00 |
For comparison, DeepSeek V4-Pro runs $1.74 / $3.48 per 1M input / output tokens and Gemini 3.1 Pro roughly $2 / $12, which makes GPT-5.5 the most expensive of the three. The Pro variant costs six times as much as Thinking on both input and output and is aimed at the hardest reasoning workloads, where its FrontierMath Tier 4 edge (39.6% vs 35.4%) can justify the cost.
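To see what these per-token rates mean for a concrete job, the sketch below prices a single workload across the models named in this section. The prices are the ones quoted in this article; the workload size (200K input tokens, 20K output tokens) and the names `Pricing`, `PRICES` and `job_cost` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices as quoted in this article; check current price lists before budgeting.
PRICES = {
    "GPT-5.5 (Thinking)": Pricing(5.00, 30.00),
    "GPT-5.5 Pro": Pricing(30.00, 180.00),
    "DeepSeek V4-Pro": Pricing(1.74, 3.48),
    "Gemini 3.1 Pro": Pricing(2.00, 12.00),  # the article gives these as approximate
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one job at the quoted per-million-token rates."""
    price = PRICES[model]
    total = (input_tokens * price.input_per_million
             + output_tokens * price.output_per_million)
    return total / 1_000_000


# Hypothetical workload: 200K tokens in, 20K tokens out.
for name in PRICES:
    print(f"{name}: ${job_cost(name, 200_000, 20_000):.2f}")

# Ratio of output prices, GPT-5.5 (Thinking) to DeepSeek V4-Pro.
output_ratio = (PRICES["GPT-5.5 (Thinking)"].output_per_million
                / PRICES["DeepSeek V4-Pro"].output_per_million)
print(f"Output price ratio: {output_ratio:.1f}x")
```

Because output tokens cost six times as much as input tokens on both GPT-5.5 variants, output-heavy jobs such as long generated reports or large diffs push the bill up much faster than input-heavy ones.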
Who should use GPT-5.5
GPT-5.5 is the strongest pick if your work looks like this:
- Developers and automation builders — agentic coding inside Codex is where GPT-5.5 has the clearest lead.
- Teams running long, multi-step tool workflows — the model is tuned to chain steps without drifting off task.
- Deep-research and analyst workflows — extended tasks that need the model to stay coherent over a long context.
It is less obviously the right call for cost-sensitive, high-volume jobs, where DeepSeek V4 or Gemini 3.1 Pro are far cheaper, or for media-heavy work that depends on a very large context window, where Gemini 3.1's 2M-token window is twice the size of GPT-5.5's.
How GPT-5.5 compares
GPT-5.5's main rivals this cycle are Google's Gemini 3.1 Ultra and DeepSeek's V4 series. Each takes a different angle: Gemini leans into native multimodal processing and a 2M-token context, DeepSeek into open weights and low cost.
Limitations to keep in mind
GPT-5.5 is the most expensive of the frontier models compared here: at $30 per 1M output tokens it costs roughly 8.6 times as much as DeepSeek V4-Pro ($3.48). The Codex CLI context window is also capped at 400K tokens, lower than the 1M available in ChatGPT and the API, which matters for very large codebases. As with every model in this class, benchmark scores do not always translate cleanly to a specific real-world task, so test it on your own workload before committing.
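One way to judge whether the 400K cap matters for a given codebase is to estimate its token count up front. The sketch below is a rough check, not an official tool: it assumes about 4 characters per token, a common rule of thumb for English text and code that varies by tokenizer, and the function names and file suffixes are illustrative.

```python
import math
from pathlib import Path

# Context limits as stated in this article.
CONTEXT_LIMITS = {"ChatGPT / API": 1_000_000, "Codex CLI": 400_000}

# Assumption: roughly 4 characters per token. Real tokenizers will differ.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str) -> int:
    """Estimate the token count of a string from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_repo_tokens(root: str, suffixes: tuple[str, ...] = (".py", ".md")) -> int:
    """Sum estimated tokens over all files under root with the given suffixes."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            total += estimate_tokens(path.read_text(encoding="utf-8", errors="ignore"))
    return total


def fits(token_count: int) -> dict[str, bool]:
    """Report, per surface, whether token_count fits inside the context limit."""
    return {surface: token_count <= limit for surface, limit in CONTEXT_LIMITS.items()}


# A hypothetical 650K-token codebase fits the API window but not the Codex CLI one.
print(fits(650_000))
```

In practice the prompt, tool output and the model's own responses also consume context, so a codebase that only just fits by this estimate should be treated as too large.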
Frequently asked questions
When was GPT-5.5 released?
GPT-5.5, codenamed Spud, was released by OpenAI on 23 April 2026. It is the first base model OpenAI has fully retrained since GPT-4.5.
How much does GPT-5.5 cost?
GPT-5.5 (Thinking) costs $5 per million input tokens and $30 per million output tokens on the API, roughly double the price of GPT-5.4. GPT-5.5 Pro costs $30 per million input tokens and $180 per million output tokens.
What is GPT-5.5's context window?
GPT-5.5 has a 1 million token context window in ChatGPT and the API. Inside the Codex CLI the context window is 400,000 tokens.
What is GPT-5.5 best at?
GPT-5.5 is built for agentic work: multi-step coding, computer use, and long-running research tasks. It scores 82.7% on Terminal-Bench 2.0 and 73.1% on Expert-SWE.
Is GPT-5.5 better than Gemini 3.1?
GPT-5.5 leads Gemini 3.1 on coding benchmarks — 58.6% vs 54.2% on SWE-bench Pro and 82.7% vs 68.5% on Terminal-Bench 2.0. Gemini 3.1 leads on context window size (2M tokens) and costs less per token.
Can I use GPT-5.5 for free?
At launch GPT-5.5 is available to OpenAI's paid subscribers — Plus, Pro, Business and Enterprise — in ChatGPT and Codex. Free-tier access typically follows a few weeks after release.