Noam Shazeer, Co-author of Transformer Architecture, Joins OpenAI as Head of Architecture Research

Noam Shazeer, a core co-author of the 2017 Transformer paper that underpins modern large language models, has joined OpenAI as head of architecture research after leaving Google, where he led Gemini development. His move represents a major talent acquisition for OpenAI and reflects intensifying competition among AI labs for researchers with deep expertise in model architecture as the field recognizes limitations in scaling existing Transformer designs.

Quick Facts

Who: Noam Shazeer

What: Noam Shazeer announced his departure from Google

When: 2000 (Shazeer joined Google as an early employee)

Where: Google

Shazeer joined OpenAI as head of architecture research
Shazeer co-authored the 2017 Transformer paper
Shazeer founded Character.AI after leaving Google
Google signed technology licensing agreement with Character.AI

Noam Shazeer, one of the eight co-authors of the landmark 2017 Transformer paper "Attention Is All You Need," has announced his departure from Google to join OpenAI as head of architecture research. Shazeer's move marks a significant shift in the competitive landscape of AI research, as he transitions from leading Gemini model development at Google to joining OpenAI's research leadership team. Sam Altman stated that Shazeer had been among his most desired collaborators since OpenAI's founding, and that he had waited a decade for this opportunity.

Shazeer's career has been deeply intertwined with the evolution of modern AI. He joined Google in 2000 as an early employee, initially working on search and advertising systems including spell correction, ad ranking, and spam detection. He moved into deep learning research around 2012, when he joined Google Brain and became instrumental in developing foundational technologies that would reshape the AI industry. Beyond the Transformer, Shazeer contributed to the development of Mixture of Experts (MoE), Multi-Query Attention, and Adafactor, technologies that directly influence how large language models are trained and optimized today.

In 2021, Shazeer and colleague Daniel De Freitas departed Google to co-found Character.AI after becoming frustrated with Google's reluctance to publicly release Meena, a conversational AI they had developed. According to internal communications, Shazeer believed the technology could replace Google Search and generate enormous commercial value, yet Google declined to launch it due to safety and reputational concerns. Character.AI quickly gained traction following ChatGPT's breakthrough, achieving a $1 billion valuation in March 2023 with a $150 million Series A funding round led by Andreessen Horowitz.

However, Character.AI faced mounting challenges as the cost of inference scaled with its growing user base. In August 2024, Google negotiated a technology licensing agreement with Character.AI and brought Shazeer and other key researchers back to Google DeepMind. While the transaction was valued at approximately $2.7 billion, it effectively represented a high-cost recruitment of Shazeer to lead Gemini development alongside Jeff Dean and Oriol Vinyals. This latest departure after less than two years underscores the intense competition for top-tier AI talent in an era where architectural innovation has become critical to maintaining competitive advantage.

Shazeer's new role at OpenAI signals a strategic focus on architectural research at a pivotal moment in AI development. The industry increasingly recognizes that simply scaling up existing Transformer-based models yields diminishing returns. Research highlighting the Transformer's structural limitations, particularly in maintaining dynamic internal states, consistent multi-turn reasoning, and long-term memory, suggests that next-generation models will require fundamental architectural innovations rather than incremental improvements to existing designs.

Why This Matters

Shazeer's recruitment underscores a strategic pivot in AI development: the industry recognizes that scaling existing Transformer models yields diminishing returns, and that breakthrough progress requires fundamental architectural innovation. For readers, this signals that next-generation AI capabilities will depend on novel designs that address the Transformer's limitations in reasoning, memory, and state management, making architectural research talent a primary competitive battleground among AI labs.

Timeline & Sources

Jan 1, 2009 (Wire): Shazeer briefly left Google

Jan 1, 2012 (Wire): Shazeer rejoined Google and joined the Google Brain team

Jan 1, 2020 (Wire): Shazeer and Daniel De Freitas completed development of the Meena conversational AI

Jan 1, 2021 (Wire): Shazeer and De Freitas left Google to co-found Character.AI
