research-agents

#research-agents

@k_dense_ai: Introducing Science Superpowers — a complete computational-science methodology for AI research agents. It makes your ag…

X AI KOLs Timeline · 6d ago

Science Superpowers is an open-source computational-science methodology for AI research agents, enforcing pre-registration and reproducible workflows to prevent p-hacking and HARKing.


ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence

arXiv cs.AI · 2026-05-27

ScientistOne introduces Chain-of-Evidence, a verifiability framework for autonomous research agents that makes every claim traceable to evidence. Across 75 papers, it reports zero hallucinated references, perfect score verification, and the highest method-code alignment, while matching or exceeding human expert performance on frontier research tasks.


@_avichawla: The No. 1 deep researcher beats Claude and ChatGPT with a trick neither uses. I studied the open-source architecture be…

X AI KOLs Timeline · 2026-05-25

The Onyx open-source deep research system achieves top ranking by stripping search access from its orchestrator agent, forcing it to decompose queries into focused research threads. Its three-phase pipeline and two-level architecture prevent information distortion and premature answering, outperforming proprietary solutions from OpenAI, Anthropic, and Google.


Product Integrations

Reddit r/AI_Agents · 2026-05-24

NineLayer, an MCP-based search engine for coding and research agents, has improved latency from 40s to 1.5s and is seeking user input on which platform integrations to prioritize.


Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?

arXiv cs.CL · 2026-05-20

This paper introduces REFLECT, a meta-evaluation benchmark for assessing the reliability of LLM judges in evaluating deep research agents. Experiments show current LLM judges remain unreliable, with overall accuracies below 55% across reasoning, tool-use, and report-quality failures.
