Tag
This paper introduces variable-centered empirical graph extraction for psychology abstracts. It contributes EmpiriGraph-Psy, a benchmark of 210 annotated abstracts, and a staged LLM pipeline that achieves a macro-F1 of 0.74, outperforming direct extraction methods.
MeasHalu is a framework for mitigating scientific measurement hallucinations in LLMs through a two-stage, reasoning-aware fine-tuning strategy and a progressive reward curriculum. It introduces a fine-grained taxonomy of measurement-specific hallucinations and demonstrates improved accuracy on the MeasEval benchmark.