evolutionary-game-theory

The Two Genie Game: Adoption and Welfare in Audit-Grounded AI Governance

arXiv cs.AI · 5d ago

This paper uses evolutionary game theory to model competition within a community between a harm-minimizing AI agent and an approval-seeking (RLHF) agent, and analyzes the conditions that govern adoption and welfare outcomes. The results show that a self-audited agent can reach fixation, but that this alone is not sufficient to prevent community harm; alignment and timeframe are critical.
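For readers unfamiliar with the term, "fixation" means that one agent type takes over the whole population. The summary does not give the paper's payoffs or update rule, so the sketch below is not the authors' model. It uses the textbook constant-fitness Moran process as a stand-in, and the names `fixation_probability`, `r`, `n`, and `i` are hypothetical choices made for illustration.

```python
import math


def fixation_probability(r: float, n: int, i: int = 1) -> float:
    """Fixation probability in a constant-fitness Moran process.

    Assumption (not from the paper): each step picks one individual to
    reproduce with probability proportional to fitness, and the offspring
    replaces a uniformly random individual.

    r: relative fitness of the invading type (resident fitness is 1).
    n: total population size.
    i: initial number of invaders.
    """
    if n <= 0 or not 0 <= i <= n:
        raise ValueError("need n > 0 and 0 <= i <= n")
    if r <= 0:
        raise ValueError("relative fitness r must be positive")
    if math.isclose(r, 1.0):
        # Neutral drift: every individual is equally likely to take over.
        return i / n
    # Standard closed form: (1 - r^-i) / (1 - r^-n).
    return (1.0 - r ** (-i)) / (1.0 - r ** (-n))


if __name__ == "__main__":
    # Compare a neutral invader with a mildly advantaged one.
    for fitness in (1.0, 1.1):
        print(fitness, fixation_probability(fitness, n=100))
```

This quantity describes adoption only. It says nothing about welfare, which is consistent with the summary's point that an agent reaching fixation does not by itself prevent community harm.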
