menu
close

Anthropic Unveils Claude 4: Setting New Benchmarks in AI Coding

Anthropic recently launched Claude Opus 4 and Claude Sonnet 4, establishing new standards for AI coding and reasoning capabilities. Claude Opus 4 leads industry benchmarks with 72.5% on SWE-bench, while Sonnet 4 delivers superior performance at a more accessible price point. Both models feature hybrid reasoning, allowing them to alternate between instant responses and extended thinking with tool integration, significantly enhancing their ability to handle complex, multi-step tasks.
Anthropic Unveils Claude 4: Setting New Benchmarks in AI Coding

Anthropic has officially released its next-generation AI models, Claude Opus 4 and Claude Sonnet 4, marking a significant advancement in artificial intelligence capabilities as of May 22, 2025.

Claude Opus 4, positioned as Anthropic's flagship model, has been dubbed "the world's best coding model" by the company. It leads on SWE-bench with a score of 72.5% and Terminal-bench at 43.2%, delivering sustained performance on long-running tasks that require focused effort across thousands of steps. The model can work continuously for several hours, dramatically outperforming previous Sonnet models and expanding what AI agents can accomplish.

Claude Sonnet 4 represents a significant upgrade to Claude Sonnet 3.7, delivering superior coding and reasoning while responding more precisely to instructions. Interestingly, it achieves 72.7% on SWE-bench, and with parallel test-time compute, reaches 80.2% accuracy—delivering better coding performance than the larger Opus 4 model. Anthropic describes it as balancing "performance and efficiency for internal and external use cases, with enhanced steerability for greater control over implementations."

Both models introduce powerful new capabilities, including extended thinking with tool use, allowing Claude to alternate between reasoning and tool use to improve responses. They can use tools in parallel, follow instructions more precisely, and—when given access to local files by developers—demonstrate significantly improved memory capabilities, extracting and saving key facts to maintain continuity and build tacit knowledge over time.

The models can extract and save facts in "memory" to handle tasks more reliably, building what Anthropic describes as "tacit knowledge" over time. Both Opus 4 and Sonnet 4 are "hybrid" models capable of near-instant responses and extended thinking for deeper reasoning. With reasoning mode switched on, they can take more time to consider possible solutions before answering, showing a "user-friendly" summary of their thought process.

Both models are available on the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI. Pricing remains consistent with previous Opus and Sonnet models: Opus 4 at $15/$75 per million tokens (input/output) and Sonnet 4 at $3/$15. For individual users, Anthropic offers tiered subscription plans. The free plan provides access to Claude Sonnet 4 with daily usage limits, while the Pro plan ($20/month or $200/year) offers approximately five times more usage than the free plan, access to both Claude 4 Sonnet and Claude 4 Opus via a model selector, and priority access during high traffic periods.

The launch of Claude 4 signals a new era in large language models. These offerings represent a leap in what's possible for enterprise, research, and creative applications with a 200,000-token context window, best-in-class coding and reasoning benchmarks, and a robust safety framework engineered for complex, high-stakes, and ever-changing real-world scenarios.

Source:

Latest News