@_avichawla: The No. 1 deep researcher beats Claude and ChatGPT with a trick neither uses. I studied the open-source architecture be…
Summary
The Onyx open-source deep research system achieves top ranking by stripping search access from its orchestrator agent, forcing it to decompose queries into focused research threads. Its three-phase pipeline and two-level architecture prevent information distortion and premature answering, outperforming proprietary solutions from OpenAI, Anthropic, and Google.
Cached at: 05/25/26, 10:49 AM
The No. 1 deep researcher beats Claude and ChatGPT with a trick neither uses.
I studied the open-source architecture behind it.
A counterintuitive thing I found is that the orchestrator agent that runs the entire research strategy has no search access.
It can’t query the web or open URLs.
This looks wrong at first glance. Every other deep research system gives its coordinator far more capability.
For instance:
- OpenAI’s approach trains a single model for many consecutive tool calls. It searches, reads, reasons, and writes the report in a long sequential chain.
↳ The researchers behind the No. 1 system (Onyx) observed that this causes the model to spend cycles on low-value searches instead of maintaining a high-level research strategy.
- Anthropic and Google use an orchestrator-researcher pattern similar to Onyx’s system. The key difference is how aggressively Onyx constrains the orchestrator.
Most orchestrators have access to search and retrieval tools alongside dispatch capabilities. And the moment an orchestrator can search, it will.
So instead of decomposing a query into focused research threads, it starts answering the question itself.
It pulls a few results, skips proper task decomposition, and produces a surface-level report from whatever it found first.
Stripping search from the orchestrator forces it to write self-contained and coherent task briefs for each research agent.
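A minimal sketch of that constraint, assuming a simple per-agent tool allowlist. Every name here (`Agent`, `TaskBrief`, `dispatch_research_agent`, `web_search`, `open_url`) is hypothetical and chosen for illustration; none of it is taken from the Onyx codebase.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class TaskBrief:
    """A self-contained assignment for one research agent (hypothetical shape)."""
    topic: str
    instructions: str


@dataclass(frozen=True)
class Agent:
    name: str
    allowed_tools: FrozenSet[str]

    def call(self, tool: str, payload: object) -> str:
        # Reject any tool outside this agent's allowlist.
        if tool not in self.allowed_tools:
            raise PermissionError(f"{self.name} cannot call {tool!r}")
        return f"{self.name} called {tool}"


# The orchestrator can only dispatch; search tools exist only on researchers.
orchestrator = Agent("orchestrator", frozenset({"dispatch_research_agent"}))
researcher = Agent("researcher", frozenset({"web_search", "open_url"}))

brief = TaskBrief(
    topic="EV battery recycling",
    instructions="Survey current recycling methods and cite primary sources.",
)
print(orchestrator.call("dispatch_research_agent", brief))

try:
    orchestrator.call("web_search", "EV battery recycling")
except PermissionError as err:
    print(err)
```

With no search tool in its allowlist, the only way the orchestrator can make progress is by writing briefs, which is the behavior the thread describes.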
The researchers also kept the architecture only two levels deep. When info passes through multiple agents, each one subtly distorts it through summarization/reinterpretation. Keeping it to two levels prevents this.
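One way to enforce a two-level limit is a depth check at spawn time. This is a sketch under that assumption; `AgentNode`, `spawn`, and `MAX_DEPTH` are illustrative names, not Onyx's API.

```python
MAX_DEPTH = 2  # level 1 = orchestrator, level 2 = research agents


class AgentNode:
    def __init__(self, name: str, depth: int = 1) -> None:
        self.name = name
        self.depth = depth
        self.children = []

    def spawn(self, name: str) -> "AgentNode":
        # A node at the deepest allowed level may not delegate further,
        # so findings are summarized at most once on the way back up.
        if self.depth >= MAX_DEPTH:
            raise RuntimeError(f"{self.name} is at max depth {MAX_DEPTH}")
        child = AgentNode(name, self.depth + 1)
        self.children.append(child)
        return child


root = AgentNode("orchestrator")
worker = root.spawn("research-agent-1")

try:
    worker.spawn("sub-agent")
except RuntimeError as err:
    print(err)
```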
These two constraints sit inside a larger three-phase pipeline (the visual below maps this):
→ Phase 1 decomposes the query into up to 6 research directions. Withholding tool access at this stage prevents the model from answering prematurely.
→ Phase 2 dispatches 3 isolated research agents. Each runs up to 8 sub-cycles of search, read, and think, to produce an intermediate report with citations.
The agents can also search internal enterprise docs (Confluence, Slack, 100+ connectors) with document-level permissions enforced, unlike proprietary solutions.
→ Phase 3 runs a deterministic step that renumbers and deduplicates to produce a report with a unified citation map.
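The Phase 3 merge can be sketched as follows. This is an assumption about how such a step might work, not Onyx's actual code: each intermediate report is modeled as text with local `[n]` markers plus a local `n -> URL` map, and `merge_reports` is a hypothetical helper.

```python
import re
from typing import Dict, List, Tuple

# (report text with local [n] markers, local citation number -> source URL)
Report = Tuple[str, Dict[int, str]]


def merge_reports(reports: List[Report]) -> Tuple[str, Dict[int, str]]:
    url_to_global: Dict[str, int] = {}
    merged_parts: List[str] = []

    for text, local_sources in reports:
        # Give each URL one global number, in order of first appearance.
        local_to_global: Dict[int, int] = {}
        for local_id in sorted(local_sources):
            url = local_sources[local_id]
            if url not in url_to_global:
                url_to_global[url] = len(url_to_global) + 1
            local_to_global[local_id] = url_to_global[url]

        def renumber(match: "re.Match[str]") -> str:
            local_id = int(match.group(1))
            if local_id not in local_to_global:
                return match.group(0)  # leave unknown markers untouched
            return f"[{local_to_global[local_id]}]"

        # A single regex pass avoids renumbering a marker twice.
        merged_parts.append(re.sub(r"\[(\d+)\]", renumber, text))

    citation_map = {number: url for url, number in url_to_global.items()}
    return "\n\n".join(merged_parts), citation_map


reports = [
    ("Claim A [1]. Claim B [2].", {1: "https://a.example", 2: "https://b.example"}),
    ("Claim C [1]. Claim D [2].", {1: "https://b.example", 2: "https://c.example"}),
]
merged_text, citation_map = merge_reports(reports)
print(merged_text)
print(citation_map)
```

Because the step uses no model calls, the same inputs always produce the same numbering, and a source cited by several agents appears once in the final map.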
This pattern has been ranked No. 1 on DeepResearch Bench. The whole implementation is available on GitHub and you can try it yourself.
Here’s the Onyx Repo: https://github.com/onyx-dot-app/onyx…
(don’t forget to star it)
My co-founder wrote a detailed article on building a fully open-source deep researcher using Onyx as the deep research layer, CrewAI for orchestration, and Voxtral by Mistral for voice input/output.
Read it below.
onyx-dot-app/onyx
Source: https://github.com/onyx-dot-app/onyx
Onyx - The Open Source AI Platform
Onyx is the application layer for LLMs, providing a feature-rich interface that anyone can host. Onyx extends LLMs with advanced capabilities such as RAG, web search, code execution, file creation, deep research, and more.
Connect your applications with 50+ indexing-based connectors provided out of the box or via MCP.
Deploy with a single command:
curl -fsSL https://onyx.app/install_onyx.sh | bash

⭐ Features
- 🔍 Agentic RAG: Get best-in-class search and answer quality based on a hybrid index plus AI agents for information retrieval.
  - Benchmark to be released soon!
- 🔬 Deep Research: Get in-depth reports with a multi-step research flow.
  - Top of leaderboard as of Feb 2026.
- 🤖 Custom Agents: Build AI Agents with unique instructions, knowledge, and actions.
- 🌍 Web Search: Browse the web to get up-to-date information.
  - Supports Serper, Google PSE, Brave, SearXNG, and others.
  - Comes with an in-house web crawler and support for Firecrawl/Exa.
- 📄 Artifacts: Generate documents, graphics, and other downloadable artifacts.
- ▶️ Actions & MCP: Let Onyx agents interact with external applications, with flexible auth options.
- 💻 Code Execution: Execute code in a sandbox to analyze data, render graphs, or modify files.
- 🎙️ Voice Mode: Chat with Onyx via text-to-speech and speech-to-text.
- 🎨 Image Generation: Generate images based on user prompts.
Onyx supports all major LLM providers, both self-hosted (like Ollama, LiteLLM, vLLM, etc.) and proprietary (like Anthropic, OpenAI, Gemini, etc.).
To learn more - check out our docs!
🚀 Deployment Modes
Onyx supports deployments in Docker, Kubernetes, Helm/Terraform and provides guides for major cloud providers. Detailed deployment guides found here.
Onyx supports two separate deployment options: standard and lite.
Onyx Lite
Lite mode can be thought of as a lightweight Chat UI. It requires fewer resources (under 1GB of memory) and runs a less complex stack. It is a good fit for users who want to test Onyx quickly or for teams that are only interested in the Chat UI and Agents functionality.
Standard Onyx
The complete feature set of Onyx, recommended for serious users and larger teams. Additional components not included in Lite mode:
- Vector + Keyword index for RAG.
- Background containers to run job queues and workers for syncing knowledge from connectors.
- AI model inference servers to run deep learning models used during indexing and inference.
- Performance optimizations for large-scale use via an in-memory cache (Redis) and a blob store (MinIO).
To try Onyx for free without deploying, visit Onyx Cloud.
🏢 Onyx for Enterprise
Onyx is built for teams of all sizes, from individual users to the largest global enterprises:
- 👥 Collaboration: Share chats and agents with other members of your organization.
- 🔐 Single Sign On: SSO via Google OAuth, OIDC, or SAML. Group syncing and user provisioning via SCIM.
- 🛡️ Role Based Access Control: RBAC for sensitive resources like access to agents, actions, etc.
- 📊 Analytics: Usage graphs broken down by teams, LLMs, or agents.
- 🕵️ Query History: Audit usage to ensure safe adoption of AI in your organization.
- 💻 Custom code: Run custom code to remove PII, reject sensitive queries, or to run custom analysis.
- 🎨 Whitelabeling: Customize the look and feel of Onyx with custom naming, icons, banners, and more.
📚 Licensing
There are two editions of Onyx:
- Onyx Community Edition (CE) is available freely under the MIT license and covers all of the core features for Chat, RAG, Agents, and Actions.
- Onyx Enterprise Edition (EE) includes extra features that are primarily useful for larger organizations.
For feature details, check out our website.
👪 Community
Join our open source community on Discord!
💡 Contributing
Looking to contribute? Please check out the Contribution Guide for more details.
