Tag
This paper introduces Self-Evaluation Elicitation (SEE), which uses calibration-coupled reinforcement learning and masked distillation to elicit the latent judge calibration of base LLMs from minimal data; it improves calibration across benchmarks while preserving answer quality.