self-critique

The new Claude scored 0% on "confidently reporting wrong answers" in testing. Here's a prompt that takes advantage of it on anything important.

Reddit r/ArtificialInteligence · 3d ago

Anthropic's Claude Opus 4.8 update sharply reduces confident but incorrect answers, scoring 0% on confidently reporting flawed results in testing. The post provides a prompt that uses this improvement to get critical self-critique on important work.

ICRL: Learning to Internalize Self-Critique with Reinforcement Learning

arXiv cs.AI · 2026-05-18

This paper introduces ICRL, a framework that jointly trains a solver and critic with reinforcement learning to internalize critique guidance, enabling the solver to improve without external critique. It uses distribution calibration and role-wise group advantage estimation, achieving 6-7 point gains over GRPO on agentic and mathematical reasoning tasks.
