OpenAI Launches GPT-5.4 With Native Computer Use, Challenging AI Rivals

OpenAI Accelerates Frontier AI Race With GPT-5.4 Launch

Just two days after introducing GPT-5.3 Instant, OpenAI has unveiled GPT-5.4, positioning it as its "most capable and efficient frontier model for professional work." This rapid-fire release underscores the intense competitive pressure in the generative AI market, where rivals like Anthropic's Claude Opus 4.6 and Google's Gemini 3.1 Pro are vying for dominance. The launch also arrives during a period of turbulence for OpenAI, marked by controversy over a U.S. Department of Defense deal and user cancellations.

The new model introduces several consequential advancements, most notably native computer-use capabilities, a massive 1-million-token context window via the API, and a reworked tool-calling system. Available in two specialized variants—GPT-5.4 Thinking and GPT-5.4 Pro—the model is designed to tackle complex, multi-step professional tasks.

Native Computer Use: A Leap Toward Autonomous Agents

The most significant feature of GPT-5.4 is its built-in ability to operate a computer. OpenAI describes this as its first general-purpose model released with "native, state-of-the-art computer-use capabilities" in its Codex development tool and API. This enables AI agents to carry out workflows across applications by writing code (using libraries like Playwright) or issuing direct mouse and keyboard commands in response to screenshots.
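The screenshot-driven loop described above can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: `FakeDesktop`, `Action`, and `choose_action` are invented names, the "screenshot" is a small dictionary rather than pixels, and the hard-coded policy stands in for the model's decision. It is not OpenAI's API or implementation, only the general observe, decide, act pattern.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step the agent wants to perform (hypothetical schema)."""
    kind: str  # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Stand-in for a real screen; records what the agent did."""
    focused: bool = False
    typed: str = ""
    log: list = field(default_factory=list)

    def screenshot(self) -> dict:
        # A real system would return pixels; this returns a tiny state summary.
        return {"focused": self.focused, "typed": self.typed}

    def apply(self, action: Action) -> None:
        self.log.append(action.kind)
        if action.kind == "click":
            self.focused = True
        elif action.kind == "type" and self.focused:
            self.typed += action.text


def choose_action(observation: dict, goal_text: str) -> Action:
    """Placeholder policy standing in for the model's decision."""
    if not observation["focused"]:
        return Action("click", x=100, y=200)
    if observation["typed"] != goal_text:
        return Action("type", text=goal_text)
    return Action("done")


def run_agent(desktop: FakeDesktop, goal_text: str, max_steps: int = 10) -> bool:
    """Observe, decide, act, and repeat until done or out of steps."""
    for _ in range(max_steps):
        action = choose_action(desktop.screenshot(), goal_text)
        if action.kind == "done":
            return True
        desktop.apply(action)
    return False


if __name__ == "__main__":
    desktop = FakeDesktop()
    finished = run_agent(desktop, "quarterly report")
    print(finished, desktop.log, desktop.typed)
```

The `max_steps` cap is a design choice worth noting: an agent that acts on its own observations needs a hard stop so a confused policy cannot loop indefinitely.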

This capability moves beyond simple UI wrappers, representing a foundational step toward the agentic future that AI companies envision. It allows for automated, long-horizon tasks such as data gathering, analysis, and report generation without constant human intervention. OpenAI claims benchmark records in computer-use evaluations like OSWorld-Verified and WebArena Verified.

Enhanced Accuracy and Professional Performance

OpenAI is marketing GPT-5.4 as its "most factual model yet." The company reports that, on a dataset of de-identified prompts where users previously flagged errors, individual claims are 33% less likely to be false and full responses are 18% less likely to contain any errors compared to GPT-5.2.
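Both figures are relative reductions, so what they mean in absolute terms depends on a baseline error rate that the article does not give. The short sketch below shows the arithmetic using assumed baselines chosen purely for illustration; the function name and the baseline values are not from OpenAI.

```python
def apply_relative_reduction(baseline_rate: float, reduction: float) -> float:
    """Return the new rate after a relative reduction of the baseline."""
    if not 0.0 <= baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be between 0 and 1")
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return baseline_rate * (1.0 - reduction)


if __name__ == "__main__":
    # Assumed baselines, for illustration only; the source reports none.
    assumed_claim_error_rate = 0.10
    assumed_response_error_rate = 0.40

    new_claim_rate = apply_relative_reduction(assumed_claim_error_rate, 0.33)
    new_response_rate = apply_relative_reduction(assumed_response_error_rate, 0.18)

    print(f"claims: {assumed_claim_error_rate:.1%} to {new_claim_rate:.1%}")
    print(f"responses: {assumed_response_error_rate:.1%} to {new_response_rate:.1%}")
```

Under these assumptions, a 10% per-claim error rate would fall to about 6.7%, a drop of 3.3 percentage points rather than 33.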

Early tester feedback highlights substantial gains in professional domains. Daniel Swiecki of Walleye Capital reported a 30-percentage-point improvement in accuracy on internal finance and Excel evaluations, linking it to expanded automation. The model scored a record 83% on OpenAI's GDPval benchmark, which tests performance on real-world tasks across 44 occupations, outperforming office workers most of the time.

Technical Specifications and Model Variants

GPT-5.4 comes in two primary flavors, each tailored for different use cases and tiers of OpenAI's service plans.

GPT-5.4 Thinking: Designed as a reasoning model, it will be available to all paid ChatGPT subscribers (Plus plan and above).
GPT-5.4 Pro: Optimized for high-performance, complex tasks, it is reserved for ChatGPT Pro ($200/month) and Enterprise plan users.

Both variants will be available in OpenAI's API and the Codex application. The API version supports the 1-million-token context window, the largest OpenAI has offered, and is more token-efficient, solving problems with fewer tokens than its predecessor. A new Tool Search system lets the model look up tool definitions only when it needs them, which speeds up requests and reduces costs in systems with many available tools.
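The idea of looking up tool definitions on demand, rather than sending every definition with every request, can be sketched as follows. This is an illustrative assumption about the general technique, not a description of how OpenAI's Tool Search works: the registry, the tool names, and the keyword-matching rule are all hypothetical.

```python
import json

# Hypothetical tool registry; names and schemas are invented for illustration.
TOOL_REGISTRY = {
    "get_stock_price": {
        "description": "Look up the latest price for a stock ticker.",
        "parameters": {"ticker": "string"},
    },
    "send_email": {
        "description": "Send an email message to a recipient.",
        "parameters": {"to": "string", "body": "string"},
    },
    "create_spreadsheet": {
        "description": "Create a new spreadsheet with the given rows.",
        "parameters": {"rows": "array"},
    },
    "convert_currency": {
        "description": "Convert an amount between two currencies.",
        "parameters": {"amount": "number", "source": "string", "target": "string"},
    },
}


def search_tools(query: str, registry: dict, limit: int = 2) -> list[str]:
    """Rank tools by how many query words occur in their name or description."""
    words = set(query.lower().split())
    scored = []
    for name, spec in registry.items():
        haystack = (name.replace("_", " ") + " " + spec["description"]).lower()
        # Naive substring match; a real system would use a proper search index.
        score = sum(1 for word in words if word in haystack)
        if score > 0:
            scored.append((score, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


def prompt_size(tool_names: list[str], registry: dict) -> int:
    """Characters needed to serialize the chosen definitions (rough cost proxy)."""
    return len(json.dumps({name: registry[name] for name in tool_names}))


if __name__ == "__main__":
    selected = search_tools("stock price", TOOL_REGISTRY)
    print("selected:", selected)
    print("all tools:", prompt_size(list(TOOL_REGISTRY), TOOL_REGISTRY), "chars")
    print("selected only:", prompt_size(selected, TOOL_REGISTRY), "chars")
```

Character count is used here only as a rough stand-in for token cost; the saving grows with the number of tools in the registry, which matches the article's point about systems with many available tools.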

Targeting the Enterprise: Financial Services and Spreadsheets

A major focus of this release is the professional workspace. OpenAI is debuting OpenAI for Financial Services, a suite that includes a version of ChatGPT that runs directly inside Microsoft Excel and Google Sheets. This is bolstered by partnerships with data providers like FactSet, MSCI, Third Bridge, and Moody's.

The model's ability to "more persistently search across multiple rounds to identify the most relevant sources" makes it particularly suited for complex financial analysis and legal tasks. Mercor CEO Brendan Foody stated GPT-5.4 excels at creating "long-horizon deliverables such as slide decks, financial models, and legal analysis," delivering top performance while being faster and cheaper than rival frontier models.

Market Context and Competitive Landscape

GPT-5.4 enters a fiercely competitive arena. While it claims leadership in desktop computer use and professional knowledge work, the landscape remains fragmented. Anthropic's Claude Opus 4.6 still leads on several coding benchmarks, and Google's Gemini 3.1 Pro holds advantages in abstract reasoning and offers a large context window at a lower price.

OpenAI benchmarked GPT-5.4 against GPT-5.2 rather than the very recent GPT-5.3, so its headline performance figures describe gains over an older baseline and should be read with that in mind. The company has also published its chain-of-thought evaluation methodology as open source, a step toward greater external scrutiny and transparency.

Why This Launch Matters

The introduction of GPT-5.4 is more than a routine model update. Its native computer-use capability represents a paradigm shift, moving AI from a conversational tool to an active, automated workforce. This has profound implications for productivity software, business process automation, and the future of white-collar work.

By deeply integrating with financial data providers and spreadsheet applications, OpenAI is making a direct play for the lucrative enterprise market, aiming to become an indispensable tool for knowledge workers. The speed of this release, coming just days after GPT-5.3, signals OpenAI's commitment to maintaining its perceived technological lead in a market where yesterday's breakthrough is today's table stakes.