menu
close

Anthropic Unveils Claude 4: New Benchmark in AI Reasoning

On May 22, 2025, Anthropic released Claude 4, introducing two powerful models—Opus 4 and Sonnet 4—with unprecedented reasoning capabilities and multimodal processing. The new models excel at complex tasks including coding, where Opus 4 achieves industry-leading 72.5% on SWE-bench and can sustain performance for up to seven hours. Claude 4 introduces hybrid reasoning that allows for both instant responses and extended, step-by-step thinking with improved tool integration.
Anthropic Unveils Claude 4: New Benchmark in AI Reasoning

Anthropic has officially launched its most advanced AI system to date, marking a significant evolution in artificial intelligence capabilities. The Claude 4 family, released on May 22, 2025, consists of two models: Claude Opus 4 and Claude Sonnet 4, both designed to push the boundaries of what AI can accomplish.

Claude Opus 4, Anthropic's flagship model, has been positioned as the world's best coding AI, achieving a record-breaking 72.5% score on the SWE-bench Verified benchmark, outperforming competitors including OpenAI's models and Google's Gemini 2.5 Pro. During testing at Rakuten, Opus 4 demonstrated the remarkable ability to work autonomously on complex software engineering tasks for nearly seven hours without degradation in performance—a breakthrough that transforms AI from a quick-response tool into a genuine collaborator.

Sonnet 4, designed as a more cost-effective option, still delivers impressive capabilities with a 72.7% score on SWE-bench. Available to both free and paying users, it serves as a direct upgrade to Claude 3.7 Sonnet while maintaining the same pricing structure.

What distinguishes Claude 4 is its hybrid reasoning approach. Unlike previous models that generate immediate responses, Claude 4 can toggle between near-instant answers and extended thinking mode, where it works through problems step-by-step. This approach allows for more nuanced context processing and better handling of ambiguous instructions. The models can also use multiple tools in parallel, including web search, and alternate between reasoning and tool use to improve response quality.

Both models feature a 200K token context window and significantly improved memory capabilities. When given access to local files, they can extract and save key information to maintain continuity across complex tasks. This advancement enables Claude 4 to handle sophisticated workflows that previously required human intervention.

Anthropic has implemented enhanced safety measures for Claude 4, particularly for Opus 4, which is classified under the company's ASL-3 safety tier due to its advanced capabilities. These measures include strengthened harmful content detection and cybersecurity defenses.

The release comes amid fierce competition in the AI sector, with Anthropic aiming to grow its revenue from a projected $2.2 billion this year to $12 billion by 2027. Claude 4 is now available through Anthropic's web interface, API, Amazon Bedrock, and Google Cloud's Vertex AI, with Opus 4 priced at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15.

Source:

Latest News