elicitation

Self-Evaluation Is Already There: Eliciting Latent Judge Calibration in Base LLMs with Minimal Data

Hugging Face Daily Papers · 2026-06-03

This paper introduces Self-Evaluation Elicitation (SEE), which uses calibration-coupled reinforcement learning and masked distillation to elicit latent judge calibration in base LLMs with minimal data, improving calibration across benchmarks while preserving answer quality.

Weak-to-Strong Elicitation via Mismatched Wrong Drafts

arXiv cs.CL · 2026-05-19

The paper proposes a method that uses mismatched wrong drafts from a weaker model to elicit better reasoning in a stronger learner via GRPO, reporting state-of-the-art results for Mathstral-7B on the MATH-500 and AIME benchmarks.

