OpenAI unveils Jalapeño, its first custom inference chip built with Broadcom

OpenAI has taken the wraps off the first custom chip in its history. Named Jalapeño, the processor was developed in partnership with Broadcom, with a declared goal: to better serve the inference workloads of large language models (LLMs). The announcement, still lacking technical specifics, nonetheless marks a turning point for the California-based company, which until now had relied almost exclusively on NVIDIA GPUs to power services like ChatGPT and its platform APIs.

Why custom silicon for inference

LLM inference imposes different requirements than training. While training demands raw compute power and very fast interconnects, serving responses to prompts, which consumes the majority of resources over time, prioritizes low latency, high throughput, and energy efficiency. General-purpose GPUs, flexible as they are, were not designed solely for the matrix operations and attention mechanisms typical of transformers. A custom chip, or ASIC, can instead integrate dedicated accelerators and streamline data flow, reducing both the cost per generated token and overall power consumption.
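To make the link between power draw, throughput, and cost per token concrete, here is a minimal sketch of the arithmetic. Every figure in it (wattage, tokens per second, electricity price) is a hypothetical placeholder chosen for illustration. None of them is a published specification for Jalapeño or for any real GPU.

```python
def energy_cost_per_million_tokens(power_watts, tokens_per_second, price_per_kwh):
    """Electricity cost of generating one million tokens.

    Energy per token is power divided by throughput (watts are joules
    per second), and one kWh equals 3.6 million joules.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    joules_per_token = power_watts / tokens_per_second
    kwh_per_million_tokens = joules_per_token * 1_000_000 / 3_600_000
    return kwh_per_million_tokens * price_per_kwh


# Hypothetical accelerators: the numbers are assumptions, not measurements.
gpu_cost = energy_cost_per_million_tokens(
    power_watts=700, tokens_per_second=2_000, price_per_kwh=0.10
)
asic_cost = energy_cost_per_million_tokens(
    power_watts=400, tokens_per_second=2_500, price_per_kwh=0.10
)

print(f"general-purpose GPU: {gpu_cost:.4f} per million tokens")
print(f"specialized ASIC:    {asic_cost:.4f} per million tokens")
```

The sketch covers electricity only. Hardware purchase, cooling, and networking would add to the real cost per token.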

This does not mean GPUs will disappear. Flexibility remains a crucial advantage when experimenting with different architectures or performing fine-tuning. But for stable, large-scale inference workloads like those OpenAI handles daily, a processor tailored to its own needs can translate into significant savings and greater control.

Broadcom, the industrial ally

The choice of Broadcom as a partner surprises no one who follows the semiconductor market. The American company is already the design partner behind Google’s TPUs, another high-profile example of custom silicon for artificial intelligence. Broadcom supplies the engineering expertise and manages manufacturing through foundries like TSMC, while the customer retains the intellectual property of the design. This model allows OpenAI to enter the hardware arena without building a chip design division from scratch.

For now, Jalapeño is destined exclusively for OpenAI’s internal systems. There is no indication that the processor will be commercialized or licensed to third parties. However, the entry of a player of OpenAI’s stature into the custom chip world reinforces an already evident trend: the search for GPU alternatives to contain operational expenses and reduce dependence on a constrained GPU supply chain.

Hardware autonomy and implications for self-hosted setups

The news carries significance beyond OpenAI’s borders. For enterprises and research centers evaluating on-premises LLM deployments, energy efficiency and total cost of ownership (TCO) are decisive factors. Specialized chips promise to lower the economic barrier, making it more sustainable to run models even at a smaller scale, in private data centers or air-gapped configurations where privacy and data sovereignty requirements demand that everything stay on-site.

It is no coincidence that AI-RADAR delves into precisely these scenarios: the choice between flexible GPUs and specialized accelerators is a cornerstone of self-hosted deployment strategies. A chip like Jalapeño, should it ever inspire commercially available products, could redefine the balance between performance, power consumption, and cost, pushing more organizations toward local architectures.

Yet an important trade-off must be kept in mind: an ASIC is optimized for a narrow set of models and operations. Changing the model, updating the network architecture, or applying particular quantization techniques could require hardware modifications or a new chip. In a field where LLMs evolve rapidly, this rigidity should not be underestimated.
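The interaction between TCO and this rigidity can be sketched with simple arithmetic: if a specialized chip has to be retired early because the models it serves have changed, its purchase price is spread over fewer tokens. The example below is only an illustration. The purchase price, power draw, throughput, electricity price, and facility overhead factor (PUE) are all hypothetical assumptions, and the model ignores staffing, networking, and idle time.

```python
HOURS_PER_YEAR = 24 * 365


def total_cost_of_ownership(purchase_price, power_watts, years,
                            price_per_kwh, pue=1.5):
    """Purchase price plus electricity over the service life.

    pue is the facility overhead multiplier (cooling, power delivery);
    the default of 1.5 is an assumed value, not a measured one.
    """
    energy_kwh = power_watts / 1000 * HOURS_PER_YEAR * years * pue
    return purchase_price + energy_kwh * price_per_kwh


def cost_per_million_tokens(tco, tokens_per_second, years):
    """Spread the TCO over every token generated during the service life."""
    total_tokens = tokens_per_second * 3600 * HOURS_PER_YEAR * years
    return tco / total_tokens * 1_000_000


def lifetime_cost(years):
    # Hypothetical accelerator: every number here is an assumption.
    tco = total_cost_of_ownership(
        purchase_price=30_000, power_watts=700, years=years, price_per_kwh=0.10
    )
    return cost_per_million_tokens(tco, tokens_per_second=2_000, years=years)


four_year = lifetime_cost(years=4)  # hardware stays useful for four years
two_year = lifetime_cost(years=2)   # retired early after a model change

print(f"4-year service life: {four_year:.4f} per million tokens")
print(f"2-year service life: {two_year:.4f} per million tokens")
```

Under these assumptions, halving the service life nearly doubles the cost per token, because the purchase price dominates the electricity bill.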

A piece of a larger puzzle

Jalapeño is just the latest piece in a broader redefinition of artificial intelligence infrastructure. From AWS (Trainium, Inferentia) to Meta (MTIA) and Microsoft (Maia), all major players are investing in dedicated hardware. For IT professionals designing AI architectures, keeping tabs on these developments is not an academic exercise: it means anticipating scenarios where the availability of affordable specialized silicon could unlock projects currently deemed too costly.

OpenAI, for its part, has revealed neither Jalapeño’s performance figures nor when it will enter mass production. But the mere fact that it chose to enter the field with a proprietary chip says a lot about the market’s direction: the era of general-purpose hardware for AI is not waning, but it is entering a phase of coexistence with tailored solutions. And for those running workloads in-house, every additional alternative can make a difference.