OpenAI teams with Broadcom to design custom AI chips, aiming for 10 GW of compute

OpenAI teams with Broadcom on custom AI chips, targeting 10 GW of compute, with first racks due late next year. Boosts control, may cut costs, but timing and funding are unclear.

Published on: Oct 14, 2025

OpenAI partners with Broadcom to design custom AI chips and deploy 10 GW of compute

OpenAI is teaming up with Broadcom to design custom AI accelerators, with deployment of new racks slated for late next year. The move adds vertical control to OpenAI's growing stack and sits alongside supply deals with Nvidia and AMD, plus data center partnerships with Oracle and CoreWeave.

The strategy concentrates more compute, more control, and potentially lower unit costs under one roof. It also raises questions about timing, compatibility, and the financial structure behind these deals.

What's new

  • Custom chips: OpenAI and Broadcom are co-developing AI accelerators, an effort that began about 18 months ago.
  • Scale: Sam Altman says the partnership targets 10 gigawatts of computing capacity, calling it "a gigantic amount of computing infrastructure to serve the needs of the world to use advanced intelligence."
  • Timeline: First racks are expected to land late next year.
  • Ecosystem: OpenAI still relies on Nvidia and AMD for specialized chips, and on Oracle, CoreWeave, and others for data centers.
  • Market reaction: Broadcom shares jumped more than 9% on the news. Broadcom also works with Amazon and Google.
  • Financial tension: Analyst Gil Luria said, "What's real about this announcement is OpenAI's intention of having its own custom chips. The rest is fantastical… approaching $1 trillion of commitments, and it's a company that only has $15 billion of revenue."
  • Scale ambition: Broadcom CEO Hock Tan said OpenAI needs more capacity as it moves toward "a better and better frontier model and towards superintelligence." He added, "If you do your own chips, you control your destiny."
  • Adoption: OpenAI says its products now have more than 800 million weekly users; the company is not yet profitable.

Why it matters for IT, engineering, and product

  • Supply security: Custom silicon reduces exposure to GPU shortages and pricing swings.
  • Cost/performance: Purpose-built chips can drive better throughput and lower inference costs, if software stacks align.
  • Roadmap control: Direct say over memory, interconnects, and power targets can speed feature delivery for new model classes.
  • E2E optimization: Co-design across model, runtime, and hardware can cut latency and increase utilization.

Practical moves to consider in the next 12 months

  • Plan for heterogeneity: Expect mixed fleets (Nvidia, AMD, custom accelerators). Keep models portable via standard formats and containerized runtimes.
  • Benchmark early: Build a repeatable suite (token throughput, latency, cost per 1K tokens, energy per query) to compare vendors; a sketch of such a harness follows this list.
  • Contract for flexibility: Negotiate burst capacity, instance portability, and egress terms with cloud and GPU providers.
  • Energy and cooling: 10 GW-scale ambitions signal higher density. Verify your colocation or on-prem readiness for power and thermal limits.
  • Risk register: Track schedule slip, software stack maturity, and interop with your current model servers and vector DBs.
  • Upskill the team: Prioritize courses on MLOps, model optimization, and inference at scale. See curated options by company and job role at Complete AI Training and Courses by Job.
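For the benchmarking item above, here is a minimal sketch of a repeatable comparison harness. It assumes dedicated capacity billed by the hour and a stand-in generate call; `generate_fn`, `price_per_hour`, and the backend name are illustrative placeholders, not any vendor's API.

```python
import time
import statistics
from dataclasses import dataclass

@dataclass
class BenchmarkResult:
    backend: str
    tokens_per_second: float
    p95_latency_s: float
    cost_per_1k_tokens: float

def run_benchmark(backend_name, generate_fn, prompts, price_per_hour, num_runs=5):
    """Replay a prompt set against one backend and aggregate core metrics.

    generate_fn(prompt) stands in for whatever client call your stack uses;
    it should return the number of tokens generated for that prompt.
    """
    latencies = []
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(num_runs):
        for prompt in prompts:
            t0 = time.perf_counter()
            total_tokens += generate_fn(prompt)
            latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    # Cost per 1K tokens derived from an hourly instance price (assumption:
    # dedicated capacity billed by the hour rather than per token).
    cost_per_1k = (price_per_hour / 3600.0) * elapsed / (total_tokens / 1000.0)
    p95 = statistics.quantiles(latencies, n=20)[18]  # ~95th percentile
    return BenchmarkResult(backend_name, total_tokens / elapsed, p95, cost_per_1k)

# Example with a stubbed generate function; swap in real client calls per backend.
if __name__ == "__main__":
    prompts = ["Summarize our Q3 incident report.", "Draft a release note for v2.1."]

    def fake_generate(prompt):
        time.sleep(0.05)  # simulate a short inference call
        return 200        # pretend 200 tokens were generated

    print(run_benchmark("fleet-a", fake_generate, prompts, price_per_hour=32.0))
```

Running the same suite against each backend, and against each new hardware generation as it lands, gives you comparable throughput, tail-latency, and cost numbers instead of relying on vendor benchmarks.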

Key risks to watch

  • Bubble concerns: Circular financing and multi-billion-dollar commitments could compress margins and raise counterparty risk.
  • Timeline risk: New silicon often slips; plan contingencies for training and inference capacity.
  • Compatibility: Tooling for kernels, compilers, and serving stacks may lag; budget time for migration.
  • Supply chain: Substrate, packaging, and HBM constraints can bottleneck ramp.
  • Regulatory and energy: Power availability and policy shifts can delay deployments at gigawatt scale.

What this could mean for your roadmap

If the custom chips land on time, training cycles for frontier models may shorten and inference costs may drop. Expect stronger throughput for large-context workloads and possibly new model classes that rely on tighter memory and interconnect design.

For product teams, this can enable faster iteration on AI features, more reliable capacity during launches, and pricing that supports broader rollout to users. For engineering, assume a world with multiple backends and design abstractions accordingly.
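One way to make that concrete is to code the serving layer against a narrow backend interface rather than a vendor SDK. The sketch below assumes a Python serving layer; `InferenceBackend`, `EchoBackend`, and `serve` are hypothetical names, not part of any existing library.

```python
from typing import Protocol

class InferenceBackend(Protocol):
    """The only surface application code sees; swapping Nvidia, AMD, or a
    custom accelerator behind it becomes a deployment choice, not a rewrite."""

    name: str

    def generate(self, prompt: str, max_tokens: int) -> str:
        """Return the completion text for a single prompt."""
        ...

class EchoBackend:
    """Trivial stand-in backend used for local tests and CI."""
    name = "echo"

    def generate(self, prompt: str, max_tokens: int) -> str:
        return prompt[:max_tokens]

def serve(backend: InferenceBackend, prompt: str) -> str:
    # The handler depends on the protocol only, so adding a new accelerator
    # means writing one adapter class, not touching product code.
    return backend.generate(prompt, max_tokens=128)

print(serve(EchoBackend(), "Hello, heterogeneous fleet."))
```

Adapters for each vendor (and for custom accelerators, if and when they arrive) then live behind this seam, which also keeps the benchmarking and fallback work above tractable.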
