rubrics


PReMISE: Policy Rubrics as Measurement Specifications for LLM Judges

arXiv cs.AI · 3d ago

Introduces PReMISE, a framework for discovering and auditing policy-level rubrics for LLM judges along four axes: structural adequacy, reliability, preference fit, and adversarial robustness.
