latent-evaluator

Judge Circuits

arXiv cs.CL · 22h ago

This paper investigates the internal mechanisms of LLM-as-a-judge. It finds a Latent Evaluator sub-graph, located in mid-to-late MLP layers and shared across models, that performs the abstract judgment, while format-specific terminal branches map that judgment to output tokens. The paper identifies this split as the cause of format-induced inconsistency.
