Future Technology

The man who helped invent every AI chatbot just joined OpenAI

4 min read · By Nath Connell

Key takeaways

  • Noam Shazeer co-authored the 2017 paper behind every major AI model, then built the efficiency tricks that make them affordable to run
  • Google paid $2.7 billion to bring him back from his own startup in 2024. He stayed less than two years before moving to OpenAI
  • His new title at OpenAI is Lead for Architecture Research, putting him in charge of how the next generation of models is built
  • The move gives OpenAI an edge heading into a planned IPO, where cheaper inference costs translate directly into better margins
Abstract visualisation of neural network connections, the architecture Shazeer helped design

On June 18, Noam Shazeer posted a departure notice on X. "I'm excited to share that I'll be joining OpenAI," he wrote. Google confirmed the departure. That short post reshuffled the talent landscape in a way that does not happen often.

Shazeer is one of eight researchers who co-authored "Attention Is All You Need," the 2017 paper that introduced the Transformer architecture. If you have used ChatGPT, Gemini, or Claude, you have used a system built on that paper. It is not an exaggeration to say he helped design the foundation that the entire current generation of AI runs on.

What the Transformer paper actually did

Before 2017, the dominant approach to language AI used recurrent neural networks, which processed text one word at a time in sequence. That created two problems: training was slow because each step had to wait for the one before it, and the model tended to lose track of what it read early in a long sentence by the time it reached the end.

The Transformer solved both. Every word in a sentence is processed simultaneously, and each word is compared against every other word through a mechanism called self-attention. Because no word has to wait for the one before it, training became parallelisable across thousands of processors at once. That is why models went from processing sentences to processing whole books, and why training runs that once took months now take weeks.
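For readers who want to see the comparison step, here is a minimal sketch of self-attention in plain Python. It is a simplification, not the paper's full method: real Transformers first pass each word through learned query, key and value projections and use several attention heads, and the function names here are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(tokens):
    """For each word vector, score it against every word vector (itself included)."""
    dim = len(tokens[0])
    rows = []
    for query in tokens:
        # Scaled dot product between this word and every word in the sentence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        rows.append(softmax(scores))
    return rows

def self_attention(tokens):
    """Replace each word vector with a weighted blend of all word vectors."""
    dim = len(tokens[0])
    outputs = []
    # Each row depends only on the inputs, never on another row's result,
    # which is what lets real systems compute all rows at the same time.
    for weights in attention_weights(tokens):
        blended = [sum(w * value[i] for w, value in zip(weights, tokens))
                   for i in range(dim)]
        outputs.append(blended)
    return outputs
```

The loop runs one row at a time here only for readability. Nothing in one row uses another row's output, which is the property that makes the real computation parallel.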

The innovations nobody talks about

The Transformer co-authorship is what makes Shazeer famous. What makes him strategically important is what he built after it.

The central cost problem with large AI models is this: a model with more parameters is generally more capable, but it is also more expensive to run. You pay for every token it generates. Shazeer spent years solving that problem.

His Sparse Mixture of Experts architecture lets a model have hundreds of billions of parameters while only activating a small fraction of them per query. Think of it as a model that has many specialised sections, and a smart router that sends each input to only the sections it needs. Google has said Gemini 1.5 uses a version of this approach, and GPT-4 is widely reported to as well.
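The routing idea can be sketched in a few lines of Python. This is a toy under stated assumptions: the experts are simple functions on a single number, and the router scores are passed in by hand, whereas a real model learns them. The names `top_k_route` and `moe_layer` are illustrative only.

```python
import math

def top_k_route(router_scores, k):
    """Pick the k highest-scoring experts and give them weights that sum to 1."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_scores[i] for i in chosen)
    exps = {i: math.exp(router_scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}

def moe_layer(x, experts, router_scores, k=2):
    """Run only the selected experts on x and blend their outputs."""
    gates = top_k_route(router_scores, k)
    output = 0.0
    for index, gate in gates.items():
        # Experts that were not selected are never called, so they cost nothing.
        output += gate * experts[index](x)
    return output, sorted(gates)

# Four toy "experts"; only two of them run for any given input.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x, lambda x: x * x]
result, used = moe_layer(3.0, experts, router_scores=[0.1, 2.0, -1.0, 2.0], k=2)
```

With these hand-picked scores, experts 1 and 3 tie for the top spot and share the work equally, while the other two are skipped. The model keeps all four experts' worth of parameters but pays for only two per input.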

His Multi-Query Attention reduces the memory required to serve responses at scale, which lets the same hardware handle more users at once. That is directly bankable: more users per server means lower cost per query.
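A back-of-envelope calculation shows where the saving comes from. While a model generates text, it caches a key and a value vector for every past token, in every layer, for every key/value head. Multi-Query Attention shares one key/value head across all the query heads. The model dimensions below are hypothetical, chosen for round numbers, and do not describe any real system.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Memory needed to cache keys and values for one conversation."""
    # The leading 2 counts one key tensor plus one value tensor per layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

# Hypothetical model: 32 layers, 128-dim heads, 4,096 tokens of context,
# 16-bit values. Standard attention keeps 32 key/value heads per layer.
multi_head = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=4096)

# Multi-Query Attention keeps a single shared key/value head per layer.
multi_query = kv_cache_bytes(layers=32, kv_heads=1, head_dim=128, seq_len=4096)

savings_factor = multi_head // multi_query
```

Under these assumptions the cache for one conversation shrinks from 2 GiB to 64 MiB, a factor of 32. That freed memory is what lets the same server hold more conversations at once.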

Both techniques are published and widely used. The person who devised them now works for OpenAI.

Why Google paid $2.7 billion and still lost him

Shazeer left Google in 2021, frustrated that the company would not release LaMDA publicly. He co-founded Character.AI, a chatbot platform that attracted tens of millions of users before ChatGPT arrived.

Google's response, in August 2024, was to spend roughly $2.7 billion to license Character.AI's technology and bring Shazeer back as VP of Engineering, co-leading the Gemini model family. He was credited with meaningful improvements to Gemini during that period.

He lasted under two years before OpenAI made its offer.

His title at OpenAI is Lead for Architecture Research. That puts him in charge of designing the next generation of models the company will build, at the exact moment OpenAI is preparing for an IPO at a targeted valuation of up to $1 trillion. At those numbers, cheaper, more efficient architecture is not just a technical goal. It is the margin.

For Google, losing the person most credited with helping Gemini close the gap with OpenAI, less than two years after paying billions specifically to secure him, is a structural problem. Gemini has improved. One of its lead architects just went to work for the competition.

The bigger picture

This is part of a pattern. Andrej Karpathy, an OpenAI founding member, joined Anthropic in May 2026 to lead pre-training research. Meta has spent over a year trying to poach from both companies. The number of people capable of original, production-grade AI architecture research is genuinely small. Labs are now treating individual researcher moves as strategic events.

The effects will not show up immediately. Training runs at frontier scale take months. But the institutional knowledge Shazeer carries about what works at scale and where the next efficiency gains are is not in any paper. It moved to a different building on June 18.
