StatefulDiscovery: Evidence-Calibrated Claim Formation in Open-Ended Scientific Discovery

arXiv cs.AI 06/11/26, 04:00 AM Papers

Summary

Introduces StatefulDiscovery, a framework for open-ended scientific discovery that externalizes investigation state to keep claims calibrated to the evidence behind them. Across 40 real-data discovery tasks, it produces more claims judged both well-supported and high-value than several baselines.

arXiv:2606.11851v1

Abstract: Open-ended scientific discovery asks agents to move beyond executing analyses for predefined questions. Across multiple rounds of exploration, a discovery agent must decide which phenomena warrant investigation while avoiding overinterpretation, where emerging claims exceed the evidential scope of the analyses supporting them. This creates an evidence-calibration problem: the exploration trajectory must be coupled with claim status so that evidence can guide both what to investigate next and what can be claimed. We introduce StatefulDiscovery, a discovery framework that externalizes investigation state and uses it to coordinate frontier selection, evidence acquisition, and claim adjudication. We evaluate StatefulDiscovery across 40 real-data discovery tasks. Compared with several baselines, StatefulDiscovery produces more claims overall judged to be both well-supported and high-value. Ablations indicate that structured hypotheses, local adjudication, and frontier control contribute to performance. Together, these results suggest that explicit discovery state can couple exploration with evidence-calibrated claim formation.


# StatefulDiscovery: Evidence-Calibrated Claim Formation in Open-Ended Scientific Discovery
Source: [https://arxiv.org/abs/2606.11851](https://arxiv.org/abs/2606.11851)
[View PDF](https://arxiv.org/pdf/2606.11851)


## Submission history

From: Jiayao Chen ([view email](https://arxiv.org/show-email/b7af9ecb/2606.11851))

**[v1]** Wed, 10 Jun 2026 09:28:28 UTC (2,709 KB)


Similar Articles

Formal Conjectures: An Open and Evolving Benchmark for Verified Discovery in Mathematics

LLM-AutoSciLab: Closed-Loop Scientific Discovery via Active Experimentation with LLMs

ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence

When Should an AI Scientist Stop? Verifiable Experiment Steering and Refusal for Autonomous Discovery

Declarative Data Services: Structured Agentic Discovery for Composing Data Systems
