OpenAI and Broadcom have partnered to produce an LLM-optimized inference chip they have dubbed “the first AI accelerator in a multi-generation compute platform.”
The chip, named Jalapeño, was designed by OpenAI from the ground up using its understanding of LLM fundamentals. Broadcom helped industrialize the platform through chip implementation, and Celestica came in as the board, rack and system integration partner.
Testing is still ongoing, but OpenAI claims that Jalapeño will deliver better performance per watt than current chips can. At present, engineering samples of Jalapeño chips are running machine learning workloads in OpenAI’s lab and are reaching target frequency and power.
“The world is moving to a compute-powered economy,” Greg Brockman, President and Co-Founder of OpenAI, said in a press release. “Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access.”
“Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI,” Hock Tan, President and CEO of Broadcom, added. “This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026.”
The firms claim that it took just nine months to go from initial design to manufacturing tape-out. This process was helped along by OpenAI’s AI models, which were used to accelerate parts of the design and optimization process.
“Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers,” said Richard Ho, who leads OpenAI’s hardware program. “We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware’s theoretical limits.”
The aim is to speed up inference tasks, whether that be delivering answers more quickly in ChatGPT or completing a Codex task more efficiently. This, OpenAI says, can be accomplished more effectively when it controls the full stack from chips to frontier models.
Initial deployment of Jalapeño chips in gigawatt-scale data centers is planned for the end of 2026 with a view to expanding availability in the coming years. Whether Jalapeño can compete with the likes of NVIDIA’s GB300 NVL72 remains to be seen.

