Anthropic’s New Opus 4.5 Becomes The First AI to Beat Software Engineers

Anthropic has introduced Opus 4.5, the newest version of its flagship AI model and the final release in the 4.5 series. The launch follows the rollout of Sonnet 4.5 in September and Haiku 4.5 in October.

Performance Upgrades

Opus 4.5 delivers state-of-the-art performance across major benchmarks in coding, tool use, and advanced reasoning. It leads on SWE-Bench, Terminal-bench, tau2-bench, MCP Atlas, ARC-AGI 2, and GPQA Diamond. The model is also the first to score above 80% on SWE-Bench, a key metric for coding reliability.

Ad Powered By Advergic
  Loading ad . . . 
 Ad - Continue scrolling to read

This means Opus 4.5 is fully equipped to beat software engineers in coding.

Computer Use and Spreadsheet Tools

Anthropic highlighted improvements in computer interaction and spreadsheet handling. To support this, the company is expanding access to its Claude for Chrome and Claude for Excel tools. The Chrome extension will now be available to all Max users, while the Excel-focused model will roll out to Max, Team, and Enterprise users.

Memory and Long-Context Improvements

To boost long-context performance, Opus 4.5 includes changes to how it manages memory. Dianne Na Penn, Anthropic’s head of product management for research, said the model needed more than a larger context window to improve quality. She emphasized that identifying the right information to retain is critical.

These memory upgrades also make way for an “endless chat” feature for paying Claude users. Instead of stopping when the model reaches its context limit, Opus 4.5 will automatically compress older parts of the conversation without interrupting the user.

Agentic Use Cases

Many improvements target agent-based workflows, particularly scenarios where Opus acts as the main agent coordinating multiple Haiku-powered sub-agents. Penn said strong working memory is essential for navigating codebases, reviewing long documents, and managing backtracking during complex tasks.

Opus 4.5 enters a competitive landscape. It will compete directly with recently released flagship models, including OpenAI’s GPT 5.1, launched on November 12, and Google’s Gemini 3, released on November 18.