Internal data signals stronger unit economics: OpenAI's compute margin for paid users nears 70%
Fresh internal figures point to a clear trend: OpenAI's compute profit margin for paid users has climbed to roughly 70% as of October, up from around 35% in January 2024 and about 52% at the end of last year. For finance teams, that's a meaningful shift in gross profit on the portion of revenue tied directly to inference costs.
Compute profit margin here refers to the share of revenue left after deducting the costs of running AI models for paid users. In other words, it's the closest "unit margin" proxy you'll get for model-serving economics.
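As a minimal sketch of that definition, the math is simple division (the helper name and dollar figures below are illustrative, not reported numbers):

```python
def compute_margin(revenue: float, inference_cost: float) -> float:
    """Share of revenue left after paying to serve paid users' requests."""
    return (revenue - inference_cost) / revenue

# Illustrative only: $100 of paid revenue costing $30 to serve -> 70% margin
print(f"{compute_margin(100.0, 30.0):.0%}")  # 70%
```

Note this excludes training costs, R&D, and sales spend; it is strictly the serving-side unit margin.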
The trajectory at a glance
- January 2024: ~35%
- End of last year: ~52%
- October (most recent): ~70%
That's a sharp swing from cost pressure to meaningful expansion. It suggests better model efficiency, smarter routing, and improved hardware utilization are compounding at scale.
Context from peers
According to analysis cited by industry reporting, a separate provider ran at roughly -90% compute margin last year. Projections suggest a move to about 53% by year-end, with a bullish case near 68% next year. That closing gap hints at fast operational learning across the sector, not just one player.
What likely drove OpenAI's margin expansion
- Higher utilization of inference capacity (batching, scheduling, and caching that reduce cost per request).
- Model efficiency gains (compression, routing to lighter models for routine tasks, and workload mix shift to cheaper paths).
- Pricing mix from enterprise and paid tiers that better aligns price with compute intensity.
- Incremental benefits from newer accelerators and networking that lower effective cost per token.
Individually, each lever helps. Together, they show up as cleaner unit economics without relying on aggressive price hikes.
Why this matters for finance teams
- Gross margin profile: Paid-inference margins approaching ~70% look closer to software benchmarks than many expected a year ago.
- Pricing resilience: Better unit costs give room to hold or refine pricing while keeping contribution margins healthy.
- Cash efficiency: Lower COGS per request can shorten payback on sales motions targeting AI add-ons and usage tiers.
- Forecast confidence: More stable per-unit costs reduce variance in usage-driven forecasts.
Questions to ask your AI vendors
- What is your current compute margin on paid workloads, and how has it trended over the last 4 quarters?
- How do you route traffic between heavy and light models to control cost per request?
- What utilization and batching metrics do you track, and how do they translate to COGS per 1,000 tokens?
- What's your plan for next-gen hardware and network upgrades that materially change unit cost?
- How will pricing adapt as efficiency improves: discounts, bundles, or tiered usage thresholds?
Practical checklist for your model P&L
- Track COGS per 1,000 tokens, average tokens per request, and mix by model tier.
- Measure compute margin by product and customer cohort; compare contract pricing to actual inference intensity.
- Stress test margins with 10-20% swings in token usage and model selection.
- Align sales incentives with efficient usage patterns (not just volume).
- Evaluate vendor redundancy and routing to avoid single-supplier cost spikes.
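The stress-test item in the checklist can be sketched for a flat-priced seat, where revenue is fixed but token consumption varies (seat price, usage, and COGS below are hypothetical inputs, not reported figures):

```python
def subscription_margin(price: float, usage_k_tokens: float,
                        cogs_per_1k: float) -> float:
    """Margin on a flat-priced seat whose token usage varies month to month."""
    return (price - usage_k_tokens * cogs_per_1k) / price

price = 20.0           # hypothetical: $20/month seat
base_usage_k = 1500.0  # hypothetical: 1.5M tokens/month per seat
cogs = 0.004           # hypothetical: $0.004 per 1k tokens

for swing in (-0.20, -0.10, 0.0, 0.10, 0.20):
    usage = base_usage_k * (1 + swing)
    m = subscription_margin(price, usage, cogs)
    print(f"{swing:+.0%} usage swing -> margin {m:.1%}")
```

On these inputs a ±20% usage swing moves the margin by about six points in either direction, which is the kind of variance band worth building into usage-driven forecasts.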
Where to learn more
For sector context and reporting on AI unit economics, see coverage from The Information. If you're building an internal enablement plan for finance teams working with AI tooling, these curated resources can help: AI tools for Finance.
Disclaimer: This article reflects analysis based on reported figures and is for informational purposes only. It does not constitute investment advice or a recommendation to buy or sell any security.