Tag
This paper proposes an Interpretive Audit Pipeline that uses disagreement among multiple models to detect interpretive complexity in LLM-based public comment analysis, and argues that disagreement-based evaluation is a necessary complement to standard accuracy metrics.