OpenAI and Broadcom release first custom AI inference chip

OpenAI and Broadcom released the Jalapeño AI chip to cut operating costs. Separately, WiMi posted a 235.9% profit surge to 347 million yuan.

Published on: Jun 26, 2026

OpenAI and Broadcom jointly released their first custom AI inference chip, Jalapeño, on June 24, 2026. The ASIC, designed for large language model inference, aims to cut operating costs and reduce OpenAI's reliance on external GPU providers as competition in AI computing intensifies.

Jalapeño's design and initial performance

OpenAI handled the underlying architecture, while Broadcom managed silicon implementation and network hardware. Celestica, a Canadian electronics manufacturer, integrated the boards and rack systems. Engineering samples are already running machine learning workloads, including GPT5.3, Codex, and Spark, at mass-production frequencies and power levels. OpenAI said the chip delivers better performance per watt than the current state of the art.

The move reflects pressure from high operating costs, market competition, and supply chain constraints. OpenAI previously relied on a dedicated Microsoft Azure cluster with tens of thousands of NVIDIA GPUs. With Jalapeño, the company is building a full-stack infrastructure strategy to deliver more efficient AI services.

NVIDIA's roadmap and the computing power landscape

At NVIDIA's 2026 annual shareholder meeting, held the same day as the Jalapeño release, CEO Jensen Huang described the "era of useful AI" and said demand for AI factories continues to climb. "Every industry is vying to adopt Agentic AI," Huang said, "and the AI industry can be imagined as a five-layered cake comprising energy, chips and systems, infrastructure, models, and applications."

Huang outlined the product progression: Hopper for pre-training, Blackwell for rack-scale inference, and Vera Rubin, now in full production, for intelligent agents. Major cloud providers and model developers are preparing to build on Vera Rubin, he added. Other players are pursuing specialized hardware strategies of their own: Google with its custom TPU chips, and Anthropic by diversifying its computing partnerships across Amazon and Google.

WiMi bets on AI chip clusters and edge computing

Chinese AI vision company WiMi has been building AI chip clusters and exploring low-power chips and edge computing optimization for embodied intelligence and multimodal models. Its 2025 annual report showed net profit of 347 million yuan, a 235.9% year-on-year surge, driven by quantum technology, AI, and holographic AR businesses. The company plans to continue opening computing resources and technical interfaces to accelerate commercialization.

Why this matters for IT and development

Custom inference chips like Jalapeño and the ramp-up of NVIDIA's Vera Rubin platform are likely to reshape infrastructure choices for AI workloads. IT and development teams should track ASIC performance benchmarks and evaluate how these chips could affect cloud pricing and on-premises inference deployments. If Vera Rubin becomes a common platform for agentic AI, as NVIDIA is positioning it, teams will also need updated skills in distributed computing and model optimization. Understanding these hardware shifts is critical for planning AI infrastructure.
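
For teams that want to compare accelerators on the performance-per-watt terms used above, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration, not a benchmark method from OpenAI, Broadcom, or NVIDIA: the Accelerator class, both function names, and every number in the usage section are hypothetical placeholders, and no figures for Jalapeño or Vera Rubin are implied.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    name: str                 # hypothetical label, not a real product measurement
    tokens_per_second: float  # inference throughput measured on your own workload
    power_watts: float        # average power draw during the same run


def tokens_per_joule(chip: Accelerator) -> float:
    """Performance per watt: (tokens/s) / W, which equals tokens per joule."""
    return chip.tokens_per_second / chip.power_watts


def energy_cost_per_million_tokens(chip: Accelerator, usd_per_kwh: float) -> float:
    """Electricity cost (USD) to generate one million tokens."""
    joules = 1_000_000 / tokens_per_joule(chip)
    kwh = joules / 3_600_000  # 1 kWh = 3.6 million joules
    return kwh * usd_per_kwh


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    candidates = [
        Accelerator("chip_a", tokens_per_second=1000.0, power_watts=500.0),
        Accelerator("chip_b", tokens_per_second=1500.0, power_watts=600.0),
    ]
    for chip in candidates:
        efficiency = tokens_per_joule(chip)
        cost = energy_cost_per_million_tokens(chip, usd_per_kwh=0.10)
        print(f"{chip.name}: {efficiency:.2f} tokens/J, ${cost:.4f} per 1M tokens")
```

Energy is only one part of the operating cost. Hardware amortization, cooling, networking, and utilization would need to be added before drawing conclusions about cloud pricing or on-premises deployments.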

