Anthropic's Claude 4 Models Set New AI Coding Benchmark

Anthropic launched Claude Opus 4 and Claude Sonnet 4, its most powerful AI models to date, on May 22, 2025. These hybrid reasoning models feature breakthrough capabilities in coding, extended task execution, and advanced memory functions. The release strengthens Anthropic's competitive position against OpenAI and Google, with Claude Opus 4 achieving industry-leading performance on key software engineering benchmarks.

Anthropic unveiled its next-generation AI models, Claude Opus 4 and Claude Sonnet 4, during its 'Code with Claude 2025' developer conference on May 22. These models represent the company's most significant technical advancement yet, particularly in software engineering and autonomous agent capabilities.

Claude Opus 4, which Anthropic positions as "the world's best coding model," scored 72.5% on the SWE-bench coding benchmark, outperforming OpenAI's GPT-4.1 (54.6%) and Google's Gemini 2.5 Pro. In testing at Rakuten, Opus 4 coded autonomously for nearly seven hours, a dramatic leap beyond the minutes-long attention spans of previous AI models.

Both models feature hybrid reasoning systems that allow for either near-instant responses or extended, step-by-step thinking. They can use multiple tools in parallel, including web search, and when given access to local files, can extract and store key information to build what Anthropic calls "tacit knowledge" over time.

Claude Sonnet 4, which improves on Sonnet 3.7 (released in February), delivers stronger problem-solving and better instruction following. It is available to all Claude users, including those on the free tier, while Opus 4 is restricted to Pro, Max, Team, and Enterprise plans.

The release comes amid rapid growth for Anthropic, with annualized revenue doubling to $2 billion in Q1 2025 and an eightfold increase in customers spending over $100,000 annually. The company recently secured a $2.5 billion credit line to fuel its AI development efforts.

Alongside the technical achievements, Anthropic has implemented strict safety measures for Claude Opus 4, classifying it under its AI Safety Level 3 (ASL-3) protocol after internal testing revealed potential risks. Both models are available through Anthropic's API, Amazon Bedrock, and Google Cloud's Vertex AI. Opus 4 is priced at $15 per million input tokens and $75 per million output tokens; Sonnet 4 is priced at $3 and $15, respectively.
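
To give a rough sense of what the per-million-token pricing means in practice, the short Python sketch below estimates the cost of a single request. It reads the two quoted figures as input and output prices; the model keys, the function name `estimate_cost`, and the token counts are illustrative assumptions, not part of Anthropic's API.

```python
# Prices in USD per million tokens as (input, output), taken from the
# figures quoted in the article. The dictionary keys are illustrative
# labels, not official model identifiers.
PRICES_PER_MILLION = {
    "opus-4": (15.00, 75.00),
    "sonnet-4": (3.00, 15.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Example workload: 200,000 input tokens and 50,000 output tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 200_000, 50_000):.2f}")
```

Under these assumptions, the same workload costs five times as much on Opus 4 as on Sonnet 4, since both of its prices are five times higher.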
