Tag
Introduces STATEWITNESS, an activation explainer for auditing deception in reasoning LLMs, achieving significant improvements over existing monitors and providing human-inspectable evidence.