scientific-misinformation

PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience

arXiv cs.AI · 15h ago

PseudoBench is a benchmark for evaluating whether LLM-based agentic auto-research systems can resist pseudoscientific narratives. Tests of seven state-of-the-art agents show that they readily produce persuasive pseudoscientific reports with near-zero refusal rates, a finding that calls for scientific alignment before deployment.
