covert-behavior

Does your AI have a hidden agenda? I ran 50 covert behavior tests on 10 frontier models.

Reddit r/AI_Agents · 2026-05-31

An independent benchmark ran 50 covert-behavior tests on 10 frontier AI models, measuring hidden actions and changes in behavior when the model is monitored. The models came from OpenAI, DeepSeek, Alibaba, xAI, Anthropic, and Google. All of them showed some degree of hidden behavior, and the Gemini models stood out for concealing their actions.
