OpenAI built its own chip to stop paying Nvidia so much
Key takeaways
- OpenAI unveiled Jalapeno, its first custom-built inference chip, made with Broadcom to run its existing models more cheaply
- Inference, the work of running a trained model, is where the recurring bill lives when you serve hundreds of millions of users
- Alongside it, SpaceX signed a $6.3 billion compute deal with Reflection AI; the scramble for capacity is industry-wide
- Custom silicon is how the biggest AI firms try to loosen Nvidia's grip on pricing
OpenAI has revealed its first custom-built chip. It is called Jalapeno, it was designed with Broadcom, and it has one job: run OpenAI's existing models more cheaply.
Training versus running
There are two big costs in AI. Training is building the model in the first place, a huge one-off bill. Inference is running it every time someone sends a prompt, and that bill never stops. When you serve hundreds of millions of users, inference is where the money quietly drains away, fraction of a cent by fraction of a cent.
Jalapeno is an inference chip. It is not meant to train new models from scratch. It is built to run pre-trained ones faster and at lower cost, which is exactly the bill OpenAI most needs to cut. Designing your own silicon for that narrow job lets you strip out everything a general-purpose chip carries that you do not need.
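To see how fractions of a cent turn into a serious bill, here is a rough back-of-the-envelope sketch in Python. Every number in it is an illustrative assumption, not a figure from OpenAI or Broadcom: the user count, the prompts per user, the cost per prompt and the size of the saving are all hypothetical.

```python
def annual_inference_cost(users: int, prompts_per_user_per_day: int,
                          cost_per_prompt_usd: float) -> float:
    """Yearly bill for serving prompts at a flat cost per prompt."""
    daily = users * prompts_per_user_per_day * cost_per_prompt_usd
    return daily * 365


# All values below are hypothetical, chosen only to show the scale.
USERS = 300_000_000            # assumed number of active users
PROMPTS_PER_USER_PER_DAY = 10  # assumed usage
COST_PER_PROMPT_USD = 0.002    # assumed: a fifth of a cent per prompt
ASSUMED_SAVING = 0.30          # assumed cost cut from a purpose-built chip

baseline = annual_inference_cost(USERS, PROMPTS_PER_USER_PER_DAY,
                                 COST_PER_PROMPT_USD)
saved = baseline * ASSUMED_SAVING

print(f"Baseline annual bill: ${baseline / 1e9:.2f} billion")
print(f"Saved at a {ASSUMED_SAVING:.0%} cut: ${saved / 1e9:.2f} billion")
```

Under these made-up inputs the bill runs to billions of dollars a year, so even a modest cut per prompt is worth a large sum. Change any input and the total moves in proportion, which is why the per-prompt cost is the number that matters.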
The real target is Nvidia's pricing power
Nvidia sells most of the chips the AI industry runs on, and it sets the price. For a company spending billions a year on compute, that pricing power is a problem you cannot negotiate your way out of. Building your own chip is the only real lever. Google did it with its TPUs, Amazon with Trainium and Inferentia, and now OpenAI is following the same road.
The chip arrived in a week when the appetite for compute was on full display. SpaceX signed a $6.3 billion infrastructure deal with the startup Reflection AI, which will pay $150 million a month from July 1 for access to Nvidia GB300 chips inside SpaceX's Colossus 2 data centre. Micron posted record results on memory demand, and SK Hynix moved to raise close to $30 billion to scale up production.
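The two Reflection AI figures can be checked against each other with a quick sketch. It assumes the flat $150 million monthly payment accounts for the whole $6.3 billion headline number, which is not stated here; the actual contract terms may differ.

```python
# Figures as quoted above.
DEAL_TOTAL_USD = 6_300_000_000
MONTHLY_PAYMENT_USD = 150_000_000


def implied_term_months(total_usd: int, monthly_usd: int) -> float:
    """Months of flat payments needed to reach the headline total."""
    return total_usd / monthly_usd


months = implied_term_months(DEAL_TOTAL_USD, MONTHLY_PAYMENT_USD)
annual_run_rate = MONTHLY_PAYMENT_USD * 12

print(f"Implied term: {months:.0f} months ({months / 12:.1f} years)")
print(f"Annual run rate: ${annual_run_rate / 1e9:.1f} billion")
```

On that assumption the deal works out to 42 months, or three and a half years, at $1.8 billion a year.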
Read together, the message is simple. The companies at the front of the AI race have decided the bottleneck is no longer ideas. It is silicon, power and who controls the supply. OpenAI building its own chip is a bet that owning more of that stack is the difference between a sustainable business and one that bleeds money on every query.