
Cloud Giants Accelerate Custom AI Chip Deployment Race

Major cloud service providers are rapidly advancing their custom AI chip strategies, reshaping the competitive landscape of AI infrastructure.

Google, which has one of the highest adoption rates of self-developed chips among the major CSPs, has begun mass deployment of its inference-focused TPU v6e, which became its mainstream AI chip in the first half of 2025. TrendForce highlighted that Google's server growth has been driven mainly by sovereign cloud projects and fresh data center capacity in Southeast Asia. The TPU v6e, also known as Trillium, represents a significant advancement in Google's AI hardware portfolio, delivering a 4.7x increase in peak compute performance per chip over the TPU v5e, along with doubled High Bandwidth Memory (HBM) capacity and bandwidth.
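The gen-over-gen gains cited above can be expressed as simple per-chip multipliers. The sketch below applies them to a baseline; the v5e figures used here are hypothetical placeholders for illustration, not official specifications.

```python
# Gen-over-gen TPU scaling sketch using the multipliers cited in the article:
# 4.7x peak compute, 2x HBM capacity, and 2x HBM bandwidth per chip.
# The v5e baseline values below are hypothetical, not official specs.

V5E_BASELINE = {
    "peak_compute_tflops": 100.0,   # hypothetical baseline
    "hbm_capacity_gb": 16.0,        # hypothetical baseline
    "hbm_bandwidth_gbps": 800.0,    # hypothetical baseline
}

MULTIPLIERS = {  # per-chip gains cited for TPU v6e over v5e
    "peak_compute_tflops": 4.7,
    "hbm_capacity_gb": 2.0,
    "hbm_bandwidth_gbps": 2.0,
}

def project_v6e(baseline: dict) -> dict:
    """Apply the cited per-chip multipliers to a v5e baseline."""
    return {key: value * MULTIPLIERS[key] for key, value in baseline.items()}

v6e = project_v6e(V5E_BASELINE)
print(round(v6e["peak_compute_tflops"], 1))  # 470.0
print(round(v6e["hbm_capacity_gb"], 1))      # 32.0
```

Because the article gives only relative multipliers, any absolute comparison requires plugging in real v5e datasheet numbers in place of the placeholders.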

Amazon Web Services (AWS) is focused on scaling its in-house Trainium v2 platform while developing several variants of Trainium v3, scheduled for mass production in 2026. AWS is expected to lead all U.S. CSPs in in-house AI chip shipments this year, doubling its 2024 volumes. The AWS Trainium2 chip delivers up to 4x the performance of the first-generation Trainium, and Trainium2-based Amazon EC2 Trn2 instances are purpose-built for generative AI, optimized for training and deploying models with hundreds of billions to more than a trillion parameters.

Oracle, compared with the other major CSPs, focuses more on purchasing AI servers and in-memory database (IMDB) servers. In 2025, Oracle plans to step up AI server infrastructure deployment and to integrate its core cloud database services with AI applications. Co-founder Larry Ellison has highlighted the company's unique position given the vast amount of enterprise data stored in its databases. The latest version of its database, Oracle Database 23ai, is tailored to the needs of AI workloads and is described as "the only database that can make all customer data instantly available to all popular AI models while fully preserving customer privacy."

The trend toward custom AI chips represents a strategic pivot for cloud providers seeking to optimize performance while reducing costs and dependence on third-party vendors. Custom accelerators such as AWS Trainium and Google's TPUs compete directly with NVIDIA's A100/H100 GPUs, differentiating themselves through tight cloud integration, predictable pricing, and optimized infrastructure.

According to TrendForce's latest analysis, major North American CSPs remain the primary drivers of AI server market growth, with steady demand also bolstered by tier-2 data centers and sovereign cloud projects in the Middle East and Europe. Despite geopolitical tensions and U.S. export restrictions impacting the Chinese market, global AI server shipments are projected to grow 24.3% year-over-year. This robust growth underscores how AI is becoming central to cloud service offerings and driving significant infrastructure investments across the industry.
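To make the cited growth rate concrete, the sketch below compounds the 24.3% year-over-year figure from a baseline shipment count; the baseline of 1,000 units is a hypothetical placeholder, not a TrendForce figure.

```python
# Compound the article's cited 24.3% year-over-year AI server shipment
# growth rate. The baseline unit count is hypothetical, for illustration.

YOY_GROWTH = 0.243  # 24.3% projected YoY growth (cited in the article)

def project_shipments(baseline_units: float, years: int) -> float:
    """Compound the YoY growth rate over the given number of years."""
    return baseline_units * (1 + YOY_GROWTH) ** years

# With a hypothetical 2024 baseline of 1,000 units:
print(round(project_shipments(1000, 1)))  # 1243
```

Sustained at this rate, shipments would grow by roughly 55% over two years, which illustrates why the article frames AI as central to cloud infrastructure investment.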
