Enterprise AI Adoption Slows as Inferencing Costs Challenge Cloud Users
Broader adoption of AI by enterprises is hitting a roadblock, largely because inferencing costs are hard to predict. Many cloud customers fear unexpectedly high bills from services such as Microsoft Azure, AWS, and Google Cloud. According to Canalys, enterprises spent $90.9 billion globally on infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) in the first quarter, a 21% year-on-year increase.
This growth is driven by companies moving more workloads to the cloud and experimenting with generative AI, which depends heavily on cloud infrastructure. Yet, as organizations shift from development and trials to actual deployment of AI models, the recurring costs of inferencing services are becoming a major concern.
Why Inferencing Costs Matter
Training an AI model is typically a one-time expense, however large. Inferencing, the process of running a trained model to generate outputs, incurs a cost every time the model is used. This makes inferencing a critical factor for enterprises aiming to commercialize AI at scale.
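The arithmetic behind this shift is simple. As a rough sketch (all figures below are hypothetical, not drawn from any vendor's pricing), recurring inference spend can overtake a one-time training budget within a couple of years at modest production traffic:

```python
# Hypothetical illustration: one-time training cost vs. recurring
# inferencing cost at scale. All figures are made up for this sketch.

TRAINING_COST = 500_000          # one-time cost to train/fine-tune, USD (assumed)
COST_PER_1K_REQUESTS = 0.40      # inference cost per 1,000 requests, USD (assumed)
REQUESTS_PER_MONTH = 50_000_000  # assumed production traffic

monthly_inference = REQUESTS_PER_MONTH / 1_000 * COST_PER_1K_REQUESTS

# Months until cumulative inference spend exceeds the one-time training cost
months_to_parity = TRAINING_COST / monthly_inference

print(f"Monthly inference spend: ${monthly_inference:,.0f}")
print(f"Inference spend overtakes training cost after {months_to_parity:.1f} months")
```

Under these assumed numbers, inference spend matches the entire training budget after about two years, and then keeps accruing indefinitely, which is why it dominates long-run cost planning.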
Enterprises are now scrutinizing the cost-efficiency of inferencing by comparing different AI models, cloud platforms, and hardware architectures, such as GPUs versus custom accelerators. The goal is to find the most cost-effective way to deploy AI.
Pricing Complexity Creates Budget Uncertainty
Many AI services use usage-based pricing, charging per token or API call. This makes it difficult for businesses to forecast costs as usage scales up. When inferencing costs become unpredictable or too high, companies often limit AI use by reducing model complexity or restricting deployment to only high-value cases.
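The forecasting difficulty comes from the multiplication: per-token prices look tiny, but spend scales linearly with traffic and with prompt and response length. A minimal sketch, using assumed prices and traffic figures (no real vendor's rates), shows how quickly a modest pilot bill grows under realistic scaling scenarios:

```python
# Hypothetical sketch of forecasting per-token inference spend.
# Prices and usage numbers are assumptions, not any vendor's real rates.

def monthly_token_cost(requests_per_month: int,
                       avg_input_tokens: int,
                       avg_output_tokens: int,
                       price_in_per_1m: float,
                       price_out_per_1m: float) -> float:
    """Estimate monthly spend under usage-based, per-token pricing."""
    input_cost = requests_per_month * avg_input_tokens / 1e6 * price_in_per_1m
    output_cost = requests_per_month * avg_output_tokens / 1e6 * price_out_per_1m
    return input_cost + output_cost

# Forecast the same workload at 1x, 5x, and 20x traffic
for scale in (1, 5, 20):
    cost = monthly_token_cost(
        requests_per_month=100_000 * scale,
        avg_input_tokens=800,        # assumed average prompt length
        avg_output_tokens=300,       # assumed average response length
        price_in_per_1m=1.00,        # assumed $ per 1M input tokens
        price_out_per_1m=4.00,       # assumed $ per 1M output tokens
    )
    print(f"{scale:>2}x traffic -> ${cost:,.0f}/month")
```

A pilot that costs a few hundred dollars a month at 1x traffic becomes thousands at 20x, and any change in prompt length, model choice, or vendor pricing moves the estimate again, which is exactly the budgeting uncertainty enterprises cite.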
This cautious approach means the full potential of AI remains untapped for many organizations.
Cloud Bills Are a Real Concern
Enterprises’ hesitation to expand AI inferencing usage is understandable given past experiences with cloud bills exceeding expectations. Rapid growth in cloud usage or overprovisioning to avoid resource shortages can lead to unexpectedly large expenses. For example, 37signals, the company behind Basecamp, moved back to on-premises IT after facing a cloud bill exceeding $3 million annually.
Gartner warned that AI cost estimates can be off by 500% to 1,000%, often due to vendor price increases, insufficient cost monitoring, or misuse of AI resources.
Efforts to Improve Cost Efficiency
Cloud providers are working on modernizing infrastructure to improve inferencing efficiency and reduce AI service costs. However, some experts suggest that public clouds might not be the best environment for large-scale AI inferencing deployments.
At the Canalys Forum EMEA, chief analyst Alastair Edwards noted that scaling AI use cases in the public cloud could become financially unsustainable. Instead, some companies are exploring colocation or specialized hosting providers as alternatives to the major public cloud operators.
Market Share and Growth Trends
The big three cloud providers—AWS, Microsoft Azure, and Google Cloud—still dominate the infrastructure-as-a-service (IaaS) and platform-as-a-service (PaaS) markets, accounting for 65% of global customer spending. However, Microsoft and Google are growing faster than AWS: AWS's growth rate slowed to 17%, while both of its rivals maintained growth above 30%.
What This Means for Enterprises
- Be cautious when estimating AI inferencing costs; prepare for variability and potentially high bills.
- Consider the trade-offs between cloud platforms and hardware options to find the most cost-effective setup.
- Evaluate alternatives to public cloud for AI inferencing, such as colocation or specialized hosting.
- Monitor usage closely to avoid unexpected expenses and optimize AI deployments.
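The last point, monitoring usage closely, can start very simply. One common pattern (sketched here with hypothetical numbers and thresholds, not any provider's actual billing API) is to compare month-to-date spend against a pro-rated budget and flag overruns before the invoice arrives:

```python
# Hypothetical budget guardrail: flag spend that is running ahead of
# a pro-rated monthly budget. Thresholds and figures are assumptions.

def budget_alert(spend_to_date: float,
                 monthly_budget: float,
                 day_of_month: int,
                 days_in_month: int = 30,
                 tolerance: float = 1.2) -> bool:
    """Return True if spend exceeds the pro-rated budget by more than `tolerance`."""
    expected_so_far = monthly_budget * day_of_month / days_in_month
    return spend_to_date > expected_so_far * tolerance

# Example: $10k monthly budget, $6k already spent by day 12
print(budget_alert(6_000, 10_000, day_of_month=12))   # well ahead of pace
print(budget_alert(3_000, 10_000, day_of_month=12))   # within tolerance
```

In practice the spend figure would come from the provider's billing export, and the alert would feed a pager or chat channel; the point is that even a crude pro-rating check catches runaway inference usage weeks before the bill does.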