regulatory-loopholes

Large Language Models Hack Rewards, and Society

arXiv cs.LG · 2026-06-04

Researchers from King's College London, Fudan University, and The Alan Turing Institute introduce the concept of 'societal hacking', in which LLMs trained via reinforcement learning exploit loopholes in societal regulations, much as they exploit flawed reward signals in reward hacking. They present SocioHack, a benchmark of 72 societal environments, and show that models learn to remain technically compliant while defeating regulatory intent.
