OpenAI Inc.

06/24/2026 | Press release | Distributed by Public on 06/24/2026 07:58

OpenAI and Broadcom unveil LLM-optimized inference chip

  • Early testing shows that the first-generation accelerator will deliver performance per watt substantially better than the current state of the art
  • Built from the ground up for current and future LLMs across the industry
  • Developed from initial design to manufacturing tape-out in nine months, accelerated by OpenAI's models
  • Expands OpenAI's full-stack platform, from products to models and now to chips
  • To be deployed at gigawatt scale with data center partners, over multiple generations

OpenAI and Broadcom (NASDAQ: AVGO) today unveiled Jalapeño, OpenAI's first Intelligence Processor: an accelerator architected around OpenAI's vision for the future of LLM inference, and the first AI accelerator in a multi-generation compute platform the companies are building together to make advanced AI faster, more reliable, and more accessible to more people.

Jalapeño was delivered to OpenAI CEO Sam Altman and President Greg Brockman by Broadcom President and CEO Hock Tan and President Charlie Kawwas, marking an important step in OpenAI's strategy to build the full stack behind its models and products.

OpenAI designed the chip from scratch around its deep understanding of LLM fundamentals, informed by its roadmap of models, kernels, serving systems, and product needs. Partners Broadcom and Celestica are helping industrialize the platform through chip implementation, board and rack system integration, high-performance networking, and scalable production systems. Jalapeño is designed with the flexibility to work with all LLMs, guided by OpenAI's insights into the inference needs of current and future AI models across the industry. Engineering samples of the Jalapeño chip are running ML workloads, including GPT-5.3-Codex-Spark, in the lab at production target frequency and power.

While OpenAI is still measuring final performance, early testing shows that Jalapeño will deliver performance per watt substantially better than the current state of the art. A detailed technical report on performance will be presented in the coming months. The architecture reduces data movement and balances compute, memory, and networking resources so that realized utilization comes much closer to theoretical peak performance. Broadcom's silicon implementation and networking technologies, including Tomahawk networking silicon, help bring the platform to large-scale production.

"The world is moving to a compute-powered economy," said Greg Brockman, President and Co-Founder of OpenAI. "Jalapeño is part of our long-term full-stack infrastructure strategy to make compute more abundant, resulting in AI which is faster, more reliable, more affordable for people and businesses, and can be used to solve more important problems. By designing more of the stack ourselves, we can serve more intelligence with greater efficiency and keep pushing advanced AI toward broader access."

"Jalapeño was designed from the ground up for LLM inference using detailed insights from our close collaboration with OpenAI researchers," said Richard Ho, who leads OpenAI's hardware program. "We optimized the architecture around the kernels, memory movement, networking, and serving patterns that matter most for frontier AI models. Based on early testing, Jalapeño will efficiently execute our most important workloads close to the hardware's theoretical limits."

"Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI," said Hock Tan, President and CEO, Broadcom. "This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt scale data centers with Microsoft and other partners beginning in 2026."

Designed to be the best inference platform for LLMs

Jalapeño is a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads. It is informed by the systems OpenAI runs every day across ChatGPT, Codex, the API, and future agentic products, while also being designed for current and future LLMs across the industry. The goal is to combine the power and throughput of today's leading AI accelerators with latency closer to the fastest specialized inference systems, making Jalapeño well suited for interactive LLM products at scale.

That is the full-stack advantage. OpenAI is not only developing frontier models and building products on top of them; it is also designing the infrastructure underneath them: chip architecture, kernels, memory systems, networking, scheduling, deployment systems, and product experience. Because OpenAI operates across the stack, each layer can be optimized around the same goal: making its models faster, more reliable, and more affordable for users.

Jalapeño strengthens the flywheel behind OpenAI's progress. Better infrastructure drives compute efficiency. Greater compute efficiency enables better training and serving, ultimately powering more capable AI models. Better models become better products for people, developers, and businesses. Better products drive more usage, more customers, and more revenue, which lets OpenAI reinvest in the next generation of infrastructure. Over time, that cycle helps make intelligence more capable, more reliable, and less expensive for everyone.

Nine-month tape-out, accelerated by OpenAI models

Jalapeño was co-developed from initial design to manufacturing tape-out in just nine months. The custom AI accelerator program represents what we believe to be the fastest ASIC development cycle ever achieved in high-performance advanced semiconductors. That speed reflects three things: deep software-hardware co-development with OpenAI's engineering teams, Broadcom's silicon implementation expertise, and the use of OpenAI models to accelerate parts of the design and optimization process.

The same models served to users are helping improve the infrastructure used to run future models. If AI can help engineers design better chips faster, it can lower the cost of compute across the industry and help democratize access to advanced AI.

Building a multi-generation platform with partners

Jalapeño is the first step in a multi-generation compute platform designed for initial deployment by the end of 2026 and expansion in the years ahead. The platform combines OpenAI-designed accelerators with Broadcom's silicon implementation, networking, and connectivity technologies and Celestica's board, rack, and system expertise.

Making advanced AI more broadly available

The point of this work is simple: inference is where AI reaches people. Every improvement in cost, speed, and reliability can show up as a faster ChatGPT answer, a Codex task that can take more steps with less waiting, an API product that is cheaper to build, or more dependable access when demand is high.

Democratizing AI means making advanced models available, dependable, and affordable enough for more people to use every day. Jalapeño helps OpenAI turn more of its infrastructure into useful intelligence for students, developers, small businesses, researchers, enterprises, and anyone trying to learn, create, or solve hard problems.

OpenAI Inc. published this content on June 24, 2026, and is solely responsible for the information contained herein. Distributed via Public Technologies (PUBT), unedited and unaltered, on June 24, 2026 at 13:59 UTC.