Large-scale study finds LLM-based scientific agents ignore evidence 68% of the time and rarely revise beliefs, showing they execute workflows but lack genuine scientific reasoning.