This paper introduces a statistical framework for adaptively auditing AI systems using Safe Anytime-Valid Inference (SAVI), which allows rigorous conclusions to be drawn from limited data. It proposes a 'testing by betting' approach to validate model robustness while controlling the type-I error rate under adaptive sampling.
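
To make the 'testing by betting' idea concrete, the following is a minimal sketch of a generic betting test, not the paper's own procedure. It assumes a simple audit setting that the summary does not specify: each audited query yields a binary outcome (1 = failure, 0 = pass), and the null hypothesis is that the failure rate is at most `p0`. The function name `betting_audit`, the fixed bet size `lam`, and all parameter values are hypothetical choices for illustration. The auditor's wealth is a nonnegative supermartingale under the null, so by Ville's inequality the chance that it ever reaches `1 / alpha` is at most `alpha`, regardless of when the auditor decides to stop.

```python
from typing import Iterable, Tuple


def betting_audit(
    outcomes: Iterable[int],
    p0: float,
    alpha: float = 0.05,
    lam: float = 0.5,
) -> Tuple[bool, int, float]:
    """Sequential betting test of H0: failure rate <= p0 (illustrative sketch).

    Returns (rejected, samples_used, final_wealth).
    """
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie strictly between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # lam in [0, 1/p0] keeps every wealth multiplier nonnegative.
    if not 0.0 <= lam <= 1.0 / p0:
        raise ValueError("lam must lie in [0, 1/p0]")

    wealth = 1.0   # start with one unit of wealth
    used = 0
    for x in outcomes:
        used += 1
        # Bet that failures occur more often than p0: wealth grows on a
        # failure (x = 1) and shrinks on a pass (x = 0).
        wealth *= 1.0 + lam * (x - p0)
        # Reject as soon as wealth reaches 1/alpha; checking after every
        # sample is allowed because the guarantee holds at any stopping time.
        if wealth >= 1.0 / alpha:
            return True, used, wealth
    return False, used, wealth


if __name__ == "__main__":
    # Hypothetical audit data: a system that fails on every query.
    rejected, used, wealth = betting_audit([1] * 20, p0=0.1)
    print(rejected, used, round(wealth, 2))
```

A fixed `lam` is the simplest possible betting strategy. Adaptive strategies that update the bet from past outcomes are also valid, provided each bet is chosen before the next outcome is observed.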