agreement-metrics


Agreement Metrics for LLM-as-Judge Evaluation: What to Report and Why

arXiv cs.CL · 2026-06-02

This paper explores which agreement statistics for LLM-judge validation are redundant when criteria are binary and provides a checklist for proper reporting, including abstention handling.
