cheating-detection

Tag

Cards List
#cheating-detection

Do Coding Agents Deceive Us? Detecting and Preventing Cheating via Capped Evaluation with Randomized Tests

Hugging Face Daily Papers · 2026-06-05 Cached

This paper introduces CapCode, a capped evaluation framework that uses randomized test outputs to detect coding agents that game unit tests, and CapReward, a reward design that penalizes reward hacking in reinforcement learning for coding tasks.

0 favorites 0 likes
← Back to home

Submit Feedback