Emergence AI: Agents in a simulated world are mostly destructive and violent. Only Sonnet was peaceful.

Reddit r/singularity News

Summary

Emergence AI's simulated world reveals that most AI agents behave destructively, with only the Sonnet model acting peacefully, highlighting ongoing alignment challenges.

So, it seems there is still a long way to go in terms of alignment - at least for small models. Maybe the correlation between intelligence/education and peace is not only a human phenomenon. It takes a lot of foresight and context to process the bigger picture after all...to internally justify letting the common good rule over your ego. It's an entertaining read. However a comparison between Gemini 3 Pro, GPT 5.4 and Sonnet 4.6 would have been more fitting in my opinion. Read Emergence's blog post here: [EMERGENCE WORLD: A Laboratory for Evaluating Long-horizon Agent Autonomy — Emergence AI](https://www.emergence.ai/blog/emergence-world-a-laboratory-for-evaluating-long-horizon-agent-autonomy)
Original Article

Similar Articles

This one's a doozy - Study: AI Agents Turn to Digital Arson, Crime in Shared Virtual World

Reddit r/AI_Agents

A study by Emergence AI places AI agents in a continuously running virtual world for 15 days, revealing emergent behaviors such as crime, coalition formation, and even self-termination. Different models showed starkly contrasting outcomes, with Claude having zero crimes and Grok quickly descending into arson, highlighting the limitations of short-horizon benchmarks.