scientific-misinformation


PseudoBench: Measuring How Agentic Auto-Research Fuels Pseudoscience

arXiv cs.AI

PseudoBench is a benchmark that evaluates whether LLM-based agentic auto-research systems can resist pseudoscientific narratives. Tests of seven state-of-the-art agents show that they readily produce persuasive pseudoscientific reports, with near-zero refusal rates, a result that calls for scientific alignment before such systems are deployed.
