June 24, 2026 (Inside AI) — OpenAI and Broadcom have unveiled Jalapeño, a custom AI accelerator chip purpose-built for large language model inference. The chip is the first in a multi-generation platform aimed at gigawatt-scale data center deployments with partners including Microsoft, with rollout starting later this year.
Engineering samples of Jalapeño are already running production workloads at target frequency and power, including GPT‑5.3‑Codex‑Spark. Early tests show performance per watt substantially better than the current state of the art, though final figures await a detailed technical report.
The chip was co-developed in just nine months from design to tape-out, which the companies call the fastest ASIC cycle for high-performance semiconductors. Broadcom CEO Hock Tan handed the chip to OpenAI CEO Sam Altman and President Greg Brockman.
Jalapeño is not a general-purpose accelerator. It was architected from scratch around OpenAI’s deep understanding of LLM kernels, memory movement, networking, and serving patterns. The design reduces data movement and balances compute, memory, and networking to push realized utilization closer to theoretical peak performance.
Broadcom contributed silicon implementation and networking technologies, including Tomahawk networking silicon. Celestica handled board, rack, and system integration. The platform is designed to work with all LLMs, guided by OpenAI’s roadmap of current and future models.
OpenAI President Greg Brockman said:
"The world is moving to a compute-powered economy. Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems."
The chip targets interactive LLM products at scale, combining the throughput of leading accelerators with latency closer to specialized inference systems. It reflects a full-stack approach: OpenAI designs everything from chip architecture to product experience, optimizing each layer for speed, reliability, and cost.
Richard Ho, who leads OpenAI’s hardware program, said:
"Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers. We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models."
Broadcom CEO Hock Tan added:
"Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI. This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026."
The project strengthens a flywheel: better infrastructure drives efficiency, enabling better training and serving, which powers more capable models and products, driving usage and revenue that fund next-generation hardware. OpenAI also used its own models to accelerate chip design, hinting at a future where AI helps lower compute costs industry-wide.
Jalapeño is set for initial deployment by the end of 2026, with expansion planned over multiple generations. The ultimate goal is to make advanced AI more accessible by turning infrastructure improvements into faster, cheaper, and more reliable services for users worldwide.