Notion’s GPT‑5 rebuild unlocks autonomous AI workflows


Summary

Notion rebuilt its AI system architecture with GPT-5 to enable autonomous AI agents that can reason, plan, and execute complete workflows across the platform. In Notion's evaluations for Notion 3.0, GPT-5 showed a 7.6% improvement over other state-of-the-art models on outputs aligned with real user feedback and a 100%+ improvement on multi-step structured tasks.

Notion rebuilt its AI architecture with GPT-5 to create agents that reason, act, and adapt across workflows, unlocking faster and more flexible productivity in Notion 3.0.

# Notion’s GPT‑5 rebuild unlocks autonomous AI workflows

Source: [https://openai.com/index/notion/](https://openai.com/index/notion/)

OpenAI, November 7, 2025

By rebuilding their agent system with GPT‑5, Notion created an AI workspace that can reason, act, and adapt across workflows.

- Company size: Mid-market
- Region: North America
- Industry: Technology
- Products: API

Results: 7.6% improvement over state-of-the-art models on outputs aligned with real user feedback

In late 2022, within weeks of getting access to GPT‑4, Notion had already shipped a writing assistant, rolled out workspace-wide Q&A features, and integrated OpenAI models deeply across its search, content, and planning tools.

But as models advanced, and users began asking agents to complete entire workflows, Notion’s team saw limits in their system architecture. The old pattern of prompting models to do isolated tasks was capping what was possible on their platform. Agents needed to make decisions, orchestrate tools, and reason through ambiguity, and that shift required more than prompt engineering.

> “We didn’t want to retrofit the system. We needed an architecture that actually supports how reasoning models work.”
>
> Sarah Sachs, Head of AI Modeling at Notion

## Rebuilding for reasoning models, not retrofitting around them

Instead of patching their existing stack, Notion rebuilt it. They replaced task-specific prompt chains with a central reasoning model that coordinates modular sub-agents. These agents can search across Notion, Slack, or the web; add to or edit databases; and synthesize responses using whatever tools the task requires.

With their launch of Notion 3.0, AI isn’t just embedded in workflows; it can now run them. Users assign a broad task, such as compiling stakeholder feedback, and their agent plans, executes, and reports back. The shift toward agents that choose how to work meant designing for model autonomy from the start.

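
The article does not publish Notion's implementation, so the following is only a minimal sketch of the general pattern it describes: a coordinator that asks a planner for steps and dispatches each step to a modular sub-agent. Every name here (`SubAgent`, `Step`, `Coordinator`, `stub_planner`, and the three stand-in functions) is hypothetical, and the planner is a hard-coded stub where a real system would presumably call a reasoning model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubAgent:
    """A modular worker the coordinator can call (hypothetical structure)."""
    name: str
    description: str  # plain-language description the planner can read
    run: Callable[[str], str]


@dataclass
class Step:
    agent: str        # name of the sub-agent to call
    instruction: str  # what that sub-agent should do


def search_workspace(query: str) -> str:
    # Stand-in for a search sub-agent; returns canned text for this sketch.
    return f"3 notes found for '{query}'"


def update_database(instruction: str) -> str:
    # Stand-in for a database-editing sub-agent.
    return f"database updated: {instruction}"


def summarize(text: str) -> str:
    # Stand-in for a synthesis sub-agent.
    return f"summary of [{text}]"


Planner = Callable[[str, Dict[str, str]], List[Step]]


class Coordinator:
    """Asks the planner for a plan, then runs each step and collects a report."""

    def __init__(self, agents: List[SubAgent], planner: Planner) -> None:
        self.agents = {agent.name: agent for agent in agents}
        self.planner = planner

    def run(self, task: str) -> List[str]:
        # The planner sees only names and descriptions, not implementations.
        catalog = {name: agent.description for name, agent in self.agents.items()}
        report = []
        for step in self.planner(task, catalog):
            agent = self.agents.get(step.agent)
            if agent is None:
                report.append(f"{step.agent}: skipped (unknown sub-agent)")
                continue
            report.append(f"{step.agent}: {agent.run(step.instruction)}")
        return report


def stub_planner(task: str, catalog: Dict[str, str]) -> List[Step]:
    # Assumption: a real system would have a reasoning model choose these steps.
    return [Step("search", task), Step("synthesize", f"findings for {task}")]


coordinator = Coordinator(
    [
        SubAgent("search", "Find relevant pages and messages.", search_workspace),
        SubAgent("edit_database", "Add or edit database rows.", update_database),
        SubAgent("synthesize", "Combine findings into a response.", summarize),
    ],
    stub_planner,
)
report = coordinator.run("stakeholder feedback")
for line in report:
    print(line)
```

Because the planner only receives the catalog of names and descriptions, swapping the stub for a model call would not change the dispatch loop. This is one plausible reading of "designing for model autonomy", not a description of Notion's code.
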
## Testing GPT‑5 with real product workloads

To validate the architectural shift, Notion evaluated GPT‑5 against other state-of-the-art models using actual user tasks. Evaluations were grounded in feedback Notion had already marked as high priority, including questions that surfaced in Research Mode, long-form tasks that required multi-step reasoning, and ambiguous or outdated content where model judgment mattered.

The team used a combination of LLM-as-judge scoring, structured test fixtures, and human-labeled feedback.

Key results:

- 7.6% improvement over state-of-the-art models on outputs aligned with real user feedback
- 15% better performance on difficult Research Mode questions
- 100%+ improvement on multi-step, structured tasks like deadline updates and competitor research
- Only model to fully saturate benchmarks with conflicting or outdated inputs

These evaluations helped Notion identify where GPT‑5 added value (for example, in reasoning, ambiguity, and research) and where environment-specific tuning would improve results.

> “We didn’t cherry-pick tasks. These were high-signal workflows from our product.... That’s where model differences actually show up.”
>
> Sarah Sachs, Head of AI Modeling at Notion

![A group of nine people sit and smile around a conference table in a bright office meeting room, some holding laptops and making peace signs. A large screen on the right shows a video call with three remote participants. Everyone looks relaxed and happy, suggesting a collaborative hybrid team meeting.](https://images.ctfassets.net/kftzwdyauwt9/5pRMngLODa02aEtv5nKJzP/3ce1507fd00c411c845a5009b314cc2c/Notion___OpenAI_team_photo.jpeg?w=3840&q=90&fm=webp)

## Lessons for teams building with GPT‑5

Notion’s rebuild wasn’t just about launching Notion 3.0. It was about designing a system that could support new model capabilities and adapt as those models get smarter.
Their approach offers a clear roadmap for other teams deploying agentic AI in production:

- Evaluate what matters. Use tasks your users actually do, not synthetic benchmarks.
- Test the hard stuff. GPT‑5 shines when information is ambiguous, outdated, or multi-step.
- Architect for autonomy. If agents are making decisions, your system has to give them room to reason and tools to act.
- Clarity drives performance. Even top models fall short without clean tool descriptions and good interface design.
- Rebuilding is better than patching. If your system was built for completion models, it might not scale to agents.

> “We’re already seeing returns from the rebuild.... If the next model unlocks something new, we’ll do what it takes to support it.”
>
> Sarah Sachs, Head of AI Modeling at Notion

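
The evaluation approach described in the article (structured test fixtures, a judge that scores outputs, and a comparison across models) can be sketched roughly as follows. This is an illustration under stated assumptions, not Notion's harness: the fixtures, the canned model outputs, and all names (`Fixture`, `keyword_judge`, `evaluate`, `relative_improvement`) are hypothetical toy data, and the keyword-matching judge is a simple stand-in for the LLM-as-judge scoring the article mentions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class Fixture:
    """One structured test case: a task plus facts a good answer must contain."""
    task: str
    required_facts: List[str]


def keyword_judge(output: str, fixture: Fixture) -> float:
    # Stand-in judge: fraction of required facts present in the output.
    # Assumption: a real setup might ask an LLM to grade the output instead.
    if not fixture.required_facts:
        return 0.0
    text = output.lower()
    hits = sum(1 for fact in fixture.required_facts if fact.lower() in text)
    return hits / len(fixture.required_facts)


def evaluate(
    model: Callable[[str], str],
    fixtures: List[Fixture],
    judge: Callable[[str, Fixture], float],
) -> float:
    # Mean judge score across all fixtures for one model.
    return mean(judge(model(fixture.task), fixture) for fixture in fixtures)


def relative_improvement(candidate: float, baseline: float) -> float:
    # Percentage gain of the candidate score over the baseline score.
    return (candidate - baseline) / baseline * 100.0


# Toy fixtures and canned outputs, invented purely for illustration.
fixtures = [
    Fixture("Update the launch deadline", ["march 14", "launch"]),
    Fixture("Summarize competitor pricing", ["pricing", "competitor"]),
]

BASELINE_OUTPUTS: Dict[str, str] = {
    "Update the launch deadline": "The launch date was changed.",
    "Summarize competitor pricing": "Competitor pricing starts at $10.",
}
CANDIDATE_OUTPUTS: Dict[str, str] = {
    "Update the launch deadline": "Launch moved to March 14.",
    "Summarize competitor pricing": "Competitor pricing starts at $10 per seat.",
}

baseline_score = evaluate(BASELINE_OUTPUTS.__getitem__, fixtures, keyword_judge)
candidate_score = evaluate(CANDIDATE_OUTPUTS.__getitem__, fixtures, keyword_judge)
gain = relative_improvement(candidate_score, baseline_score)
print(f"baseline={baseline_score:.2f} candidate={candidate_score:.2f} gain={gain:.1f}%")
```

Keeping the judge as a plain function argument means the same fixtures can be scored by a keyword check, an LLM grader, or human labels without changing the comparison logic, which loosely mirrors the mix of scoring methods the article lists.
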
