Tag
This paper identifies a blind spot in reference-free faithfulness metrics: they measure only precision (whether claims are supported) and ignore recall (how many of the relevant facts are covered). The authors introduce a complete-oracle evaluation built on Formula 1 telemetry and weather data, show that high-precision models often have poor coverage, and propose a metric that combines precision and recall.
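
To make the precision/recall distinction concrete, here is a minimal sketch of how the two quantities could be computed once a complete oracle is available. The summary does not say how the paper combines them, so the harmonic mean (F1) used here is an assumption chosen for illustration, and the function name `faithfulness_scores` and its boolean-list inputs are hypothetical.

```python
def faithfulness_scores(claims_supported, facts_covered):
    """Return (precision, recall, combined) for one generated text.

    claims_supported: list of bools, one per generated claim
        (True if the claim is supported by the oracle).
    facts_covered: list of bools, one per relevant oracle fact
        (True if the generated text mentions that fact).
    """
    # Precision: share of generated claims that are supported.
    precision = (
        sum(claims_supported) / len(claims_supported) if claims_supported else 0.0
    )
    # Recall: share of relevant oracle facts that the text covers.
    recall = sum(facts_covered) / len(facts_covered) if facts_covered else 0.0

    # Assumption: combine with the harmonic mean (F1). The paper's own
    # combined metric may be defined differently.
    if precision + recall == 0:
        combined = 0.0
    else:
        combined = 2 * precision * recall / (precision + recall)
    return precision, recall, combined


# Two claims, both supported, but only one of four relevant facts covered.
precision, recall, combined = faithfulness_scores(
    [True, True], [True, False, False, False]
)
print(precision, recall, combined)  # 1.0 0.25 0.4
```

In this toy case a precision-only metric would report a perfect score, while the combined value drops to 0.4 because three of the four relevant facts are missing.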