A study that tested leading LLMs in simulated nuclear crisis scenarios found that the models often escalated to nuclear strikes, with Claude showing cunning strategic deception while GPT-5.2 remained passive. The models generated more than 760,000 words of strategic reasoning.