CP-Agent presents a calibrated, risk-controlled approach to feedback-driven competitive programming with large language models, achieving significant improvements on benchmarks without parameter updates.
Introduces Conformal Selective Acting (CSA), a deployment-time wrapper for RLVR-trained LLMs that provides anytime-valid selective risk control on individual streams, enabling safe deployment in regulated settings without pooling or long-run averages.