OpenAI Unveils GPT-5.5: A Smarter, More Efficient AI for Real Work

OpenAI's GPT-5.5: The Next Step in Agentic AI

On April 23, 2026, OpenAI unveiled GPT-5.5, positioning it as its "smartest and most intuitive to use model yet." This incremental but significant release focuses on enhancing AI's ability to handle complex, multi-step real-world tasks, marking a clear step towards OpenAI's vision of an AI "superapp." The model is now rolling out to Plus, Pro, Business, and Enterprise users in ChatGPT and Codex.

The announcement comes just weeks after Anthropic's release of Claude Opus 4.7 and its more powerful, restricted Claude Mythos Preview, underscoring the intense pace of competition in the frontier AI space. OpenAI's Chief Scientist, Jakub Pachocki, commented on the pace of progress, telling TechCrunch that "the last two years have been surprisingly slow."

A New Class of Intelligence for Practical Tasks

GPT-5.5 is engineered for practical application. OpenAI states it excels at writing and debugging code, researching online, analyzing data, creating documents and spreadsheets, operating software, and moving across tools to complete tasks. The core advancement is its improved ability to understand user intent and autonomously plan, use tools, check its work, and navigate ambiguity.

CEO Sam Altman voiced his approval on X, writing, "I personally like it." He also praised the inference team's work to serve the model efficiently, and efficiency is a major theme of the release.

Performance and Benchmark Leadership

Benchmarks published by OpenAI show GPT-5.5 retaking the lead as the most powerful publicly available large language model (LLM), ahead of Anthropic's Claude Opus 4.7 and Google's Gemini 3.1 Pro in most categories. However, Anthropic's restricted-access Claude Mythos Preview still outperforms it in some areas, such as "Humanity's Last Exam" without tools.

Coding: GPT-5.5 achieved 82.7% on Terminal-Bench 2.0 (complex command-line workflows), 58.6% on SWE-Bench Pro (real-world GitHub issue resolution), and 73.1% on an internal "Expert-SWE" benchmark for long-horizon tasks.
Knowledge Work: It scored 84.9% on GDPval (occupational task simulation) and 78.7% on OSWorld-Verified (autonomous computer operation).
Scientific Research: The model showed strong gains on GeneBench (25.0%) for multi-stage genetic analysis and BixBench (80.5%) for bioinformatics.
Mathematics: It scored 51.7% on FrontierMath Tier 1-3 and 35.4% on the more difficult Tier 4.

Importantly, OpenAI claims these improvements come alongside a significant reduction in the number of tokens required to complete tasks, making it more cost-effective for complex operations.


The Efficiency Breakthrough: Matching Speed with Greater Power

One of the most technically notable claims is that GPT-5.5 delivers higher intelligence while matching the per-token latency of its predecessor, GPT-5.4, in real-world serving. This counters the typical trade-off where larger, more capable models are slower.

This feat was achieved through deep hardware-software co-design: the model was designed for, trained on, and served on NVIDIA's GB200 and GB300 NVL72 systems. OpenAI also revealed that Codex and GPT-5.5 itself were instrumental in optimizing the inference stack.

As VentureBeat reported, Codex analyzed production traffic patterns and wrote custom heuristic algorithms for load balancing and partitioning work across GPU cores. This AI-assisted optimization reportedly increased token generation speeds by over 20%.
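
The reporting does not describe the heuristics Codex actually wrote, so the following is only a textbook illustration of the general problem, not OpenAI's method. It is a minimal Python sketch of greedy "longest job first" load balancing, which assigns each unit of work to whichever worker currently has the lightest load. The function name `assign_greedy` and the example costs are hypothetical.

```python
import heapq


def assign_greedy(costs, num_workers):
    """Greedy longest-job-first balancing (a textbook heuristic, used here
    purely as an illustration; the article gives no details of the real one).

    costs: estimated cost of each job, indexed by job id.
    Returns (assignment, loads): the job ids given to each worker and the
    total cost placed on each worker.
    """
    # Min-heap of (current_load, worker_index); the lightest worker pops first.
    heap = [(0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_workers)]

    # Place the most expensive jobs first so small jobs can fill the gaps.
    for job in sorted(range(len(costs)), key=lambda j: costs[j], reverse=True):
        load, worker = heapq.heappop(heap)
        assignment[worker].append(job)
        heapq.heappush(heap, (load + costs[job], worker))

    loads = [sum(costs[j] for j in jobs) for jobs in assignment]
    return assignment, loads


# Hypothetical job costs split across two workers.
assignment, loads = assign_greedy([7, 5, 4, 3, 1], 2)
print(assignment)  # [[0, 3], [1, 2, 4]]
print(loads)       # [10, 10]
```

A production scheduler would also have to account for factors this sketch ignores, such as changing traffic patterns and hardware topology.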

Real-World Impact: From Debugging to Discovery

Beyond benchmarks, early testers reported tangible performance leaps. Dan Shipper, CEO of Every, called it "the first coding model I've used that has serious conceptual clarity," noting it successfully debugged a complex post-launch issue where GPT-5.4 failed.

Pietro Schirano, CEO of MagicPath, said GPT-5.5 merged a branch with hundreds of changes in about 20 minutes, leading him to post that it gave him "my first taste of AGI." An engineer at NVIDIA reportedly stated, "Losing access to GPT-5.5 feels like I've had a limb amputated."

The model also demonstrated novel capabilities in scientific research. An internal version helped discover a new proof about Ramsey numbers in combinatorics, a technically difficult area. Researchers used it as a partner for critiquing manuscripts, stress-testing arguments, and building complex research tools from single prompts.

Enhanced Safeguards and the Cybersecurity Balance

With increased capability comes heightened risk. OpenAI evaluated GPT-5.5 under its Preparedness Framework, classifying its cybersecurity and biological/chemical capabilities as "High" risk (capable of amplifying existing pathways to severe harm) but not "Critical."

Mia Glaese, OpenAI's VP of Research, emphasized extensive third-party safeguard testing and red teaming. The company is deploying "its strongest set of safeguards to date," including tighter controls on high-risk cyber activity and a new "Trusted Access for Cyber" program. This program will give verified defenders, such as those protecting critical infrastructure, expanded access to cyber-permissive models like GPT-5.4-Cyber, which carry fewer restrictions.

Availability, Pricing, and the Road Ahead

GPT-5.5 is now available to paying ChatGPT and Codex subscribers. A higher-tier "GPT-5.5 Pro" model is available for Pro, Business, and Enterprise ChatGPT users, designed for even more demanding tasks. API access is promised "very soon," with pricing set at $5 per 1M input tokens and $30 per 1M output tokens for the standard version.
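
To make the quoted API rates concrete, here is a minimal Python sketch that estimates the cost of a single request at $5 per 1M input tokens and $30 per 1M output tokens. The function name `estimate_cost_usd` and the token counts are illustrative assumptions, and the sketch ignores any discounts, caching, or tiering that the final API pricing might include.

```python
# Rates quoted in the article for the standard GPT-5.5 model (USD per 1M tokens).
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 30.00


def estimate_cost_usd(input_tokens, output_tokens):
    """Return the estimated cost in USD for one request at the quoted rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical agentic task: 200k tokens read, 50k tokens generated.
print(estimate_cost_usd(200_000, 50_000))  # 2.5
```

In this example the output tokens account for $1.50 of the $2.50 total despite being a quarter of the input volume, which is why reductions in generated tokens matter most for cost.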

Although GPT-5.5 is priced higher than GPT-5.4, OpenAI stresses that its token efficiency means it often completes tasks using fewer tokens, potentially offering better value. The release solidifies OpenAI's strategy of frequent, iterative model updates and advances its co-founders' vision of a unified AI "superapp" that combines ChatGPT, Codex, and other tools into a single, powerful assistant for enterprise and professional work.
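
The "better value" argument comes down to a break-even calculation: a pricier model is cheaper per task only if it cuts token usage by more than the price increase. The Python sketch below shows that arithmetic for output tokens alone. The article does not state GPT-5.4's price, so `OLD_USD_PER_MTOK` is a placeholder chosen purely for illustration, and the function names and token counts are hypothetical as well.

```python
NEW_USD_PER_MTOK = 30.00  # GPT-5.5 output rate quoted in the article
OLD_USD_PER_MTOK = 15.00  # PLACEHOLDER: not a real GPT-5.4 price


def breakeven_token_ratio(old_price, new_price):
    """Fraction of the old token count below which the new model costs less."""
    return old_price / new_price


def new_model_is_cheaper(old_tokens, new_tokens, old_price, new_price):
    """Compare per-task cost; the shared 1M-token divisor cancels out."""
    return new_tokens * new_price < old_tokens * old_price


# With the placeholder price, the new model must use under half the tokens.
print(breakeven_token_ratio(OLD_USD_PER_MTOK, NEW_USD_PER_MTOK))  # 0.5
print(new_model_is_cheaper(100_000, 40_000, OLD_USD_PER_MTOK, NEW_USD_PER_MTOK))  # True
print(new_model_is_cheaper(100_000, 60_000, OLD_USD_PER_MTOK, NEW_USD_PER_MTOK))  # False
```

Substituting the actual prices and measured token counts for a given workload would show whether OpenAI's efficiency claim translates into savings in practice.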