This paper investigates asymmetries in LLMs' pragmatic competence by comparing their performance as judges of linguistic appropriateness (pragmatic listeners) with their performance as generators of pragmatically appropriate language (pragmatic speakers). The study finds that many models perform substantially better as listeners than as speakers, suggesting a misalignment between their evaluation and generation capabilities.