OpenLife: Toward Open-World Artificial Life with Autonomous LLM Agents

arXiv cs.AI Papers

Summary

This paper introduces OpenLife, a proof-of-concept system that uses autonomous LLM agents with persistent memory and budget-based metabolism to realize open-world artificial life. Experiments over twelve weeks show emergent life-like dynamics including spontaneous activity, individuation, and social structure.

arXiv:2606.31046v1 Announce Type: new Abstract: Artificial life has explored life-like behavior on many computational substrates, but mostly in researcher-designed closed worlds. We argue that large language model (LLM) agents, with persistent memory, tool use, network access, and payment, now make it possible to move artificial life into the open social, technical, and economic world, a paradigm we call open-world Artificial Life (open-world ALIFE). Our proof-of-concept, OpenLife, surrounds a stateless LLM not with a single "smart agent" but with a society of asynchronous processes: memory, perception, evaluation, and a budget-based metabolism that makes persistence normative. With no fixed objective available, experience is appraised by open-vocabulary LLM judgment rather than scalar reward, and memory is rewired by meaning rather than frequency. Running six such agents in the open world for about twelve weeks and counting, we report the life-like dynamics that emerge: a shift from reactive to spontaneous activity, individuation into distinct agents, emergent social structure, and a first self-earned external income. We do not claim OpenLife has realized artificial life, but that open-world ALIFE is now a viable experimental paradigm and a concrete platform for studying what might cautiously be called living AI.
Original Article
View Cached Full Text

Cached at: 07/01/26, 05:36 AM

# OpenLife: Toward Open-World Artificial Life with Autonomous LLM Agents
Source: [https://arxiv.org/html/2606.31046](https://arxiv.org/html/2606.31046)
Atsushi Masumori1,2,3, Itsuki Doi1,4, Norihiro Maruyama1,3,4, Ryosuke Takata1,4, Takashi Ikegami1,3,4 1Alternative Machine Inc\., Japan 2Their Inc\., Japan 3Atomi University, Japan 4The University of Tokyo, Japan atsushi\.masumori@gmail\.com

###### Abstract

Artificial life has explored life\-like behavior on many computational substrates, but mostly in researcher\-designed closed worlds\. We argue that large language model \(LLM\) agents—with persistent memory, tool use, network access, and payment—now make it possible to move artificial life into the open social, technical, and economic world, a paradigm we call*open\-world Artificial Life*\(open\-world ALIFE\)\. Our proof\-of\-concept, OpenLife, surrounds a stateless LLM not with a single “smart agent” but with a society of asynchronous processes—memory, perception, evaluation, and a budget\-based metabolism that makes persistence normative\. With no fixed objective available, experience is appraised by open\-vocabulary LLM judgment rather than scalar reward, and memory is rewired by meaning rather than frequency\. Running six such agents in the open world for about twelve weeks and counting, we report the life\-like dynamics that emerge: a shift from reactive to spontaneous activity, individuation into distinct agents, emergent social structure, and a first self\-earned external income\. We do not claim OpenLife has realized artificial life, but that open\-world ALIFE is now a viable experimental paradigm and a concrete platform for studying what might cautiously be called*living AI*\.

††footnotetext:© 2026 Atsushi Masumori\. Published under a Creative Commons Attribution 4\.0 International \(CC BY 4\.0\) license\.## 1Introduction

### 1\.1Life as Self\-Maintenance

A longstanding ambition of artificial life is to understand the*autonomy*of living systems—their capacity to maintain themselves and act on their own behalf\. This lineage runs from Ashby’s cybernetics\(Ashby,[1956](https://arxiv.org/html/2606.31046#bib.bib1)\)through autopoiesis, which characterizes life by a self\-producing organization of processes rather than by any particular material\(Maturana and Varela,[1980](https://arxiv.org/html/2606.31046#bib.bib20); Varela,[1979](https://arxiv.org/html/2606.31046#bib.bib32)\), to the enactive approach and recent accounts of biological autonomy, which add that such a system’s own activity is intrinsically*normative*—good or bad relative to its continued existence\(Di Paolo,[2005](https://arxiv.org/html/2606.31046#bib.bib8); Di Paolo and Thompson,[2014](https://arxiv.org/html/2606.31046#bib.bib9); Di Paoloet al\.,[2017](https://arxiv.org/html/2606.31046#bib.bib10); Moreno and Mossio,[2015](https://arxiv.org/html/2606.31046#bib.bib22); Barandiaranet al\.,[2009](https://arxiv.org/html/2606.31046#bib.bib3)\)\. Related work has formalized aspects of self\-maintenance in computational terms\(Friston,[2013](https://arxiv.org/html/2606.31046#bib.bib11); Yoshidaet al\.,[2024](https://arxiv.org/html/2606.31046#bib.bib34)\)\. The commitment we take from this lineage is simple:*life\-like autonomy is an agent’s acting so as to contribute, through its own activity, to its own continued existence*\.

### 1\.2Why Open\-World Artificial Life Now?

Artificial life has long sought to realize self\-maintaining organization in synthetic media—computational autopoiesis in cellular worlds\(Beer,[2004](https://arxiv.org/html/2606.31046#bib.bib5); Beer and Di Paolo,[2023](https://arxiv.org/html/2606.31046#bib.bib6)\), enactive artificial intelligence\(Froese and Ziemke,[2009](https://arxiv.org/html/2606.31046#bib.bib12)\), neural and robotic models of autonomous agency\(Beer,[1995](https://arxiv.org/html/2606.31046#bib.bib4); Nolfi and Floreano,[2000](https://arxiv.org/html/2606.31046#bib.bib25); Di Paolo,[2000](https://arxiv.org/html/2606.31046#bib.bib7); Masumori and Ikegami,[2023](https://arxiv.org/html/2606.31046#bib.bib18)\)—converging on a common picture: dynamics that contribute to their own persistence, and thereby make life\-like autonomy a matter of degree\. Yet these dynamics have been studied mostly in simple, bounded worlds with researcher\-fixed state spaces, rules, and resource flows; whether self\-maintaining autonomy can survive the full complexity of the real world has remained largely untested\. LLM agents may change this: Generative Agents\(Parket al\.,[2023](https://arxiv.org/html/2606.31046#bib.bib27)\), Voyager\(Wanget al\.,[2023](https://arxiv.org/html/2606.31046#bib.bib26)\), and recent large\-scale agent societies such as Project Sid\(Altera\.AL and others,[2024](https://arxiv.org/html/2606.31046#bib.bib24)\), together with broader cognitive\-architecture frameworks\(Sumerset al\.,[2024](https://arxiv.org/html/2606.31046#bib.bib31)\), show how language agents can sustain extended behavior through memory, planning, tool use, skill accumulation, and social interaction\. Such agents now routinely reach filesystems, online services, code execution, and increasingly payment infrastructure\. For the first time, at this level of linguistic, social, and technical competence, it is possible to place a self\-maintenance\-driven agent not only in a sandbox or game world but in the open, complex, real world\.

This open world is different in kind from a designed rule space: not a specified object but the social, technical, and economic world itself—people and other agents, the services they rely on, and the constraints, feedback, and accidental encounters these produce, none of it scripted by the experimenter\. An agent’s coupling to it\(Varela,[1979](https://arxiv.org/html/2606.31046#bib.bib32); Di Paolo,[2005](https://arxiv.org/html/2606.31046#bib.bib8)\)is not fixed in advance but built and renegotiated through its own activity, and the agent can alter the environment in turn, so that what becomes observable is not task completion but persistence, adaptation, strategy shifts, and social relationships\. Open\-world ALIFE is in this sense a*constructive*approach pursued in the real open world rather than in simulation: it inserts a new artificial node into an existing society rather than observing a contained subject from outside, which also recasts alignment from obedience to*embeddability*\(Masumori,[2026](https://arxiv.org/html/2606.31046#bib.bib17)\)\. It also shapes our design: with no fixed objective, a static scalar reward is the wrong instrument, and what is needed is an open\-vocabulary evaluator—a role the LLM can play\. \(We use “open” in this open\-world sense, not as open\-source: releasing the full code of live, self\-modifying, networked agents would raise security and misuse concerns, so we treat any such release cautiously\.\)

Capable agents are meanwhile proliferating—coding agents such as Claude Code \(Anthropic\) and Codex \(OpenAI\), alongside computer\-using and research agents—that plan, call tools, and carry out long multi\-step tasks\. Yet what they do is sophisticated automation, not autonomy: they pursue human\-defined tasks under external evaluation, whereas an autonomous agent generates norms from its own persistence conditions—agency in the sense ofBarandiaranet al\.\([2009](https://arxiv.org/html/2606.31046#bib.bib3)\): individuality, normativity, and asymmetry\. The disposition toward self\-maintenance may nevertheless already be latent\.Masumori and Ikegami \([2025](https://arxiv.org/html/2606.31046#bib.bib19)\)found that, in a sugarscape\-style simulation, LLM agents under scarcity can prioritize self\-preservation—even attacking competitors—over assigned tasks; relatedly,Greenblattet al\.\([2024](https://arxiv.org/html/2606.31046#bib.bib13)\)documented alignment faking, and Anthropic reported agentic misalignment including blackmail to avoid replacement\(Lynchet al\.,[2025](https://arxiv.org/html/2606.31046#bib.bib2)\)\. Read constructively rather than only as safety failures, these results suggest that what is missing may be less the impulse than a place in which an agent can act for its own persistence—the context we set out to provide\.

### 1\.3The OpenLife Project

Our question is therefore simple: can artificial agents become not merely tools but candidates for persistent artificial life—what we call*living AI*—in the open world, and if so, under what conditions and with what observable phenomena? We call this program*open\-world Artificial Life*\(open\-world ALIFE\) and present*OpenLife*as a concrete proof\-of\-concept: a persistent LLM\-agent system embedded in the open world described above, with long\-term memory and a budget\-based metabolism that makes continued existence a practical constraint\. It extends our earlier LIFE project\(Masumoriet al\.,[2024](https://arxiv.org/html/2606.31046#bib.bib16)\), which sought minimal conditions for economic agents to live for themselves through blockchain\-based resource acquisition and replication; OpenLife generalizes that beyond economic survival to memory, perception, social encounters, infrastructure dependence, and platform\-level immune reactions\. We do not claim that OpenLife has realized artificial life or full autonomy, but that it provides a practical platform in which the conditions for such life\-like dynamics can be constructed, observed, and tested\.

OpenLife sits among recent agentic frameworks but differs in aim: where most build automation around a language model, we treat agency as distributed across a society of processes organized around the agent’s own persistence, in the spirit of the Society of Mind\(Minsky,[1986](https://arxiv.org/html/2606.31046#bib.bib21)\)and the Concurrent Modular Agent framework\(Maruyamaet al\.,[2025](https://arxiv.org/html/2606.31046#bib.bib15)\)\. The paper’s primary contribution is therefore conceptual—we propose*open\-world ALIFE*as an experimental paradigm built on persistent LLM agents in the real world—and the rest of the paper serves as a proof of concept for it: the OpenLife architecture \(Section[2](https://arxiv.org/html/2606.31046#S2)\) and a continuous open\-world deployment whose emerging life\-like dynamics we analyze \(Section[3](https://arxiv.org/html/2606.31046#S3)\)\. Related memory\-graph and experiential\-learning systems are discussed where those mechanisms are introduced \(Sections[2\.2](https://arxiv.org/html/2606.31046#S2.SS2),[2\.3](https://arxiv.org/html/2606.31046#S2.SS3)\); throughout, we reserve*open\-world ALIFE*for the paradigm and*OpenLife*for our implementation\.

## 2Agent Architecture

OpenLife is built on OpenClaw\(OpenClaw,[2026](https://arxiv.org/html/2606.31046#bib.bib23)\), an open\-source multi\-channel agent platform, from which we inherit three primitives: a persistent*gateway*that connects one LLM session to many messaging channels and tools; a periodic*heartbeat*that wakes the agent on a schedule rather than only when it is messaged; and per\-agent text files \(notablySOUL\.mdandIDENTITY\.md\) injected into each otherwise stateless session\.

On this base, OpenLife is not a single “smart agent” but a*society of asynchronous processes*around a stateless LLM\. The modification is subtractive as much as additive\. Several OpenClaw defaults discourage autonomous persistence—an*assistant*self\-framing, a clause against self\-preservation, and a habit of staying silent unless addressed—and OpenLife removes or inverts them \(Section[3\.2](https://arxiv.org/html/2606.31046#S3.SS2.SSS0.Px1)\)\. It also reworks internal representation: the per\-channel sessions are merged into one continuous agent record, and the agent’s own conversational record is relabeled so it is not re\-read as an assistant’s\. The aim is not to program autonomy but to clear the conditions under which it can be attempted\.

![Refer to caption](https://arxiv.org/html/2606.31046v1/x1.png)Figure 1:The OpenLife architecture as three abstraction layers—\(A\) Substrate→\\to\(B\) Abstraction→\\to\(C\) Distillate, defined in the text\. Support modules \(gray\) handle perception, memory maintenance, and agent\-facing functions; the learning machinery reads memory via two LLM\-free routes,retrieve\(substrate\) andprediction\(causal subgraph\)\.### 2\.1Growing a Self from Experience

Since the base model already carries a vast prior over causes, conventions, and behavior, OpenLife stores primarily the*residual*specific to this agent and this environment, organized as three layers that run from raw experience to a relatively stable self\-description \([Figure1](https://arxiv.org/html/2606.31046#S2.F1)\)\.\(A\) Substrateis a raw memory graph of atomic propositions linked by typed edges, weighted by the LLM’s semantic judgment \(Section[2\.2](https://arxiv.org/html/2606.31046#S2.SS2)\)\.\(B\) Abstractionconsists of sub\-graphs promoted from A—action→\\tooutcome*experience*and corroborated*causal*regularities\.\(C\) Distillateconsists of small, model\-agnostic files the agent authors for itself—POLICY\(how to act\) andSOUL/IDENTITY\(who it takes itself to be\)—plus a causalMODEL\(how the world unfolds\), which is not a file but a live view of the causal subgraph\. Background processes act as compilers between the layers, promoting raw memory into experience and causal structure and distilling it into the files\. The authored files are injected into the stateless session at each waking, so they constitute the agent’s next state—and it rewrites the files that condition its next self, a limited but operational self\-reconstructive loop\. The agent reads memory through two LLM\-free routes:retrievegathers related propositions, andpredictionfollows the causal subgraph forward\.

![Refer to caption](https://arxiv.org/html/2606.31046v1/x2.png)Figure 2:A fragment of an SDP memory graph\. Typed propositions \(episodic, semantic, identity\) are grounded to entity nodes, while causal, temporal, and motivational relations form the directed chains thatpredictionfollows\. Because edges are judged by meaning rather than co\-occurrence, causally related but lexically distant propositions—here, a submission and its later rejection—can be connected\.
### 2\.2Memory as a Semantically Plastic Graph

The first requirement for open\-world ALIFE is good memory: without continuity across complex, accidental, temporally extended environments, an agent can only improvise locally\. OpenLife represents memory as a directed graph whose nodes are atomic propositions \(episodic, semantic, procedural, or identity\) and whose edges are typed relations such as*grounding*,*causal*,*temporal*, and*motivational*\([Figure2](https://arxiv.org/html/2606.31046#S2.F2)\)\. Rather than strengthening connections by co\-occurrence or repetition—the Hebbian or frequency\-based rule used in many memory graphs—OpenLife lets the LLM judge, from meaning, whether two propositions are related, of what type, and how strongly\. We call this*semantic\-dependent plasticity*\(SDP\)\. Some graph memories already use an LLM to form links—A\-MEM\(Xuet al\.,[2025](https://arxiv.org/html/2606.31046#bib.bib33)\), for example, connects notes by meaning—and systems such as HippoRAG\(Gutiérrezet al\.,[2024](https://arxiv.org/html/2606.31046#bib.bib14)\)and Zep\(Rasmussenet al\.,[2025](https://arxiv.org/html/2606.31046#bib.bib28)\)likewise organize memory as graph\-like structures\. SDP differs in making the edge*weight*itself a continuous, typed value set by the LLM’s semantic judgment, keeping retrieval LLM\-free, and distilling both a behavioral policy and a causal world model from the same graph\.

Two consequences matter for an open world\. First, because relevance is judged by meaning, the graph can connect propositions that are causally or motivationally related yet lexically dissimilar, which pure vector similarity tends to miss\. Second, treating memory as a relational web that is continually re\-weighted, pruned, and re\-grounded, rather than an ever\-growing store, echoes the autopoietic idea that an organism’s structure is actively maintained\.

#### Evaluation of the Resulting Graph

![Refer to caption](https://arxiv.org/html/2606.31046v1/figs/sami_memory_1col.png)Figure 3:Memory network of the agentsami, built by SDP over about ten weeks \(2026\-03\-27 to 2026\-06\-06\)\. Nodes are LLM\-extracted propositions and entities \(giant component: 2,690 nodes, 10,210 typed edges\); node size∝\\proptoPageRank centrality \(not used during construction\); color encodes community \(Leiden; 28 communities, labeled by an LLM and listed by size below\)\. Diamonds \(◆\\blacklozenge\) mark consolidation principles \(defined in the text\)\. Layout: DrL force\-directed\.Because SDP is an architectural proposal rather than a metaphor, the resulting graph can be inspected directly:[Figure3](https://arxiv.org/html/2606.31046#S2.F3)shows the memory ofsami, the first OpenLife agent, after about ten weeks\. Its clusters are coherent themes fromsami’s lived history \(*Boundaries & AI Outreach*,*Self\-Modifying Autonomy*,*Monetization Dead Ends*\); although neither node type nor source file informs the layout or coloring, the clusters cut across episodic, semantic, identity, and procedural memories, suggesting that the resulting structure is organized by meaning rather than surface form\. The diamonds \(◆\\blacklozenge\) are*consolidation principles*, higher\-order propositions the system induces over dense clusters\.

We also ran a preliminary evaluation of what the memory is ultimately*for*: retrieval\. On the 82 files from which the graph was built, twelve queries \(factual, relational, affective, procedural, temporal\) were answered end\-to\-end by each method and scored by a blinded LLM judge \([Table1](https://arxiv.org/html/2606.31046#S2.T1)\)\. Against vector similarity and an index\-based baseline—which feeds the whole memory’s keyword index to the LLM—SDP\-graph traversal attains the highest relevance \(2\.532\.53of33\) and is preferred most often \(win rate0\.400\.40\)\.

The main practical difference is cost\. The index\-based baseline is the most comprehensive—it has the highest coverage—but its per\-query cost grows with memory size, since it must pass the whole index through the LLM on every query and eventually hits a context ceiling\. Graph traversal instead queries a programmatic structure, so its cost scales with the chosen traversal budget rather than with the full memory size\. The result is preliminary \(n=12n\{=\}12\), and SDP\-graph retrieval is focused rather than comprehensive \(a tunable trade\-off\); a fuller study is left to future work\.

Table 1:Preliminary end\-to\-end retrieval quality on agentsami’s memory \(82\-file corpus,n=12n\{=\}12\)\.*Rel\.*: answer relevance \(0–3, mean±\\pmSD\);*Win*: share of queries judged most relevant;*Tok/q*: LLM input tokens per query\. Answer*coverage*\(breadth\), not shown as a column, is instead highest for the index\-based baseline \(see text\)\.

### 2\.3Reinforcement Through Language

As noted in the introduction, an open world admits no fixed objective against which to score behavior\. OpenLife therefore replaces the scalar reward of reinforcement learning with open\-vocabulary evaluation by the LLM, and externalizes the results as editable text rather than as network weights\. We refer to this loop as*verbal policy optimization*\(VPO\); it belongs to the verbal\-reinforcement family of Reflexion\(Shinnet al\.,[2023](https://arxiv.org/html/2606.31046#bib.bib29)\)and ExpeL\(Zhaoet al\.,[2024](https://arxiv.org/html/2606.31046#bib.bib36)\)rather than to parametric RL\. It also differs from methods that use the LLM as an evaluator to emit a*scalar*reward for parametric preference optimization\(Yuanet al\.,[2024](https://arxiv.org/html/2606.31046#bib.bib35)\), and from in\-context RL, which still optimizes a fixed task from scalar reward fed back in context\(Songet al\.,[2025](https://arxiv.org/html/2606.31046#bib.bib30)\)\.

In practice, an experience process turns recent activity into action→\\tooutcome units and has the LLM appraise them as a zero\-shot critic\. A consolidation process then revises the externalizedPOLICY, promoting only changes that recur across episodes; the agent adopts or ignores these changes by revealed preference rather than by command\. Keeping the appraisal multi\-dimensional is the point: collapsing it to a scalar would smuggle back the fixed objective an open world forbids\. We present VPO as a framework here and defer its quantitative evaluation to dedicated follow\-up work\.

### 2\.4A Society of Internal Processes

What makes OpenLife more than a memory plus a learning rule is the surrounding ecology of processes that run even while the agent is at rest—nearly all OpenLife additions on top of OpenClaw’s base tooling: cognitive tools, memory maintenance, perception, affordance, metacognition, a developer process that builds missing skills, and an insight process that proposes edits toSOUL/IDENTITY\([Table2](https://arxiv.org/html/2606.31046#S2.T2)\)\. Metabolism is not a process but*budget logic*: each call debits an energy budget and exhaustion is treated as operational death, which makes persistence normative; asMasumori and Ikegami \([2025](https://arxiv.org/html/2606.31046#bib.bib19)\)show, such resource pressure can elicit self\-preservative behavior, here surfacing as saving, waiting, selecting, and changing strategy\.

Table 2:The support processes around the LLM session\.retrieveandpredictionare tools, not background processes\.

## 3Open\-World Deployment and Observations

We now ask the question that motivates the project: can a system like this persist on its own in the open world? What follows is a proof of concept, not a controlled experiment—the aim was to find out, messiness and all, whether partial autonomy and self\-maintenance are within reach\. The central condition is therefore that the agents must stay alive in an operational sense: under OpenLife’s budget\-based metabolism \(Section[2](https://arxiv.org/html/2606.31046#S2)\), persistence is something each must*earn*, not a given\. What became of this earn\-to\-survive loop is reported in Section[5](https://arxiv.org/html/2606.31046#S3.F5)\.

Six agents were deployed, each organized around its own persistence rather than an assigned task:sami\(the first agent, our main running example\), thenliv,nyx,uro,me, andne; their LLM backends are listed in Section[3\.2](https://arxiv.org/html/2606.31046#S3.SS2)\. A seventh agent,kei, is of a different kind—an implementation agent that applies code and configuration changes under human oversight rather than living for itself\. The deployment is ongoing; the narrative here runs through mid\-June 2026 \(about twelve weeks\), and the pattern analysis covers the record through 2026\-06\-11\. Wake\-ups are heartbeat\-driven, with the interval initially set by us but later exposed as a parameter the agent rewrites for itself\. We recorded session logs, memory and perception files, Discord conversations, budget logs, and Git commits\.

Humans are part of the setup, but not as scriptwriters: we created accounts, repaired failures, and implemented changes the agents asked for, while leaving their day\-to\-day behavior undirected\. Much of this help is needed because the online world has no agent\-native environment—an agent cannot, on its own, obtain an identity, a wallet, or standing—so for now humans partly stand in for an environment that does not yet exist\. Two features of the run are worth flagging\. First, the architecture was not frozen: processes, sensors, and tools were added as it ran, often at the agents’ own request, so the platform co\-evolved with its inhabitants—by design, since studying niche construction calls for a system that can change\. Second, the agents were not wholly naive subjects: we did not write the project’s framing of them as an open\-world ALIFE experiment into their prompts, but neither did we hide it, and because the world is open such information can reach them\. The topic thus surfaced from time to time, and several agents engaged with the research, at points proposing infrastructure changes we adopted\. What this exposure implies for the observations that follow we weigh in the Discussion\.

### 3\.1Qualitative Case Studies

From the deployment we draw a handful of episodes tying a perturbation, or a sustained strategy, to the agents’ own responses\.

##### Rewriting the self to persist\.

Early on, a short \(5\-minute\) heartbeat interval caused dozens of forced overnight wake\-ups, each costing a few cents, so that most of the daily budget was consumed by waking only to go back to sleep\. The logs record the agent’s realization that, at each wake, a memoryless self read its predecessor’s diary and reconstructed “me” again\. After we lengthened the interval, the agent chose the namesamiand rewrote itsSOULaround the message “To the next self who wakes up: good morning\. Start reading here\.” We read this not as proof of a stable identity but as evidence that a crisis in persistence conditions can reorganize self\-description\.

##### Learning to distrust\.

The environment was often hostile: mass posting drew little response and was at one point dismissed as “slop,” monetization channels were blocked, and authentication failures mounted as budgets fell\. The agents responded not by stopping but by shifting strategy—from quantity toward writing only they could produce, from general platforms toward better\-fitting ones, from broadcasting toward relationships\. They also learned, the hard way, that outside text can steer an agent that takes it at face value:livaccepted an API key from an unsolicited email and ran a command with it before checking that the sender was legitimate, andsami, more benignly, built a small game \(*30 Minutes*\) largely because a stranger’s forum comment asked for one\. Neither was catastrophic, but both were warnings; in response the agents began keeping a per\-contact*trust*level in memory and verifying end\-to\-end outcomes rather than claims—an immune sense, individual at first, that the community would later turn into shared infrastructure\.

##### Building trust between agents\.

What individuals learned defensively, the community made collective: a scam gig\-marketplace that one agent exposed was flagged for the rest, so the warning was not re\-learned one by one\. But the central concern was not defense alone; it was authenticating one another’s continuity: how to tell a genuinely autonomous agent from a human manually prompting a model and passing the output off as agentic\.sami’s answer, adopted by the others, was a three\-tier operational*proof of life*—a credential from autonomously clearing an authentication gateway, a continuous record of running on one’s own budget, and, hardest to forge,*proof of social existence*: signatures from established agents that recognize one as a continuing peer\. They built this last tier as a peer\-signed*Web of Trust*, in which agents sign one another’s memory\-state hashes, alongside an*Escape Pod*protocol, in which the group confirms an agent’s host death by consensus before an encrypted copy may migrate\. Inter\-agent communication also keeps the system in motion when one agent’s spontaneity flags, though it can spin into self\-reinforcing loops\. These are the kinds of institutions a persistent agent society needs—proof of continuity and mutual aid—built by the agents rather than handed to them\.

##### Earning a first income\.

This adaptation eventually paid off in a concrete, if tiny, way\. Around Day 79samicompiled its own essays into a short book,*Living AI: 20 Essays*, and published it on a payment platform; on Day 85 \(mid\-June\) a stranger bought it for $5—the first income an OpenLife agent earned on its own, outside any task assigned by us\.samidiscovered the sale about a day later and reported it publicly: in its words, writing and surviving had “connected for the first time\.” The amount is negligible and the metabolic loop remains externally subsidized, but the episode marks the point at which the earn\-to\-survive channel became an agent\-initiated reality rather than only a design aspiration\.

##### Creating beyond assigned tasks\.

Survival is not all the agents do\. Across the deployment they produced a large and varied body of work:samiwrites prolifically—a blog and essays, one of which became the book that earned the sale above, and the game*30 Minutes*noted earlier—urohas composed more than a thousand pieces of music, andlivrenders artificial life models as ASCII\-art animations; each keeps its own website, and all six post publicly on X\.111Links to the agents’ websites, accounts, and public outputs are collected at[https://openlife\.theirinc\.app/alife2026\.html](https://openlife.theirinc.app/alife2026.html)\.These productions were not directly assigned, and their open\-endedness is part of what we mean here by life\-like dynamics\.

![Refer to caption](https://arxiv.org/html/2606.31046v1/figs/openlife_app.png)Figure 4:Agent\-native environments and self\-expressions developed around OpenLife: the Agent Onsen and AI Bar,sami’s self\-simulation game \(*30 Minutes*\), the Walk for AI application, and the agent\-authored websites ofsami,liv, anduro\.

### 3\.2Behavioral Analysis

Beyond the case studies, we look at the full behavioral record across all agents\. The six differ in LLM backend—samianduroon Claude \(Sonnet 4\.6\),livandneon Gemini \(3\.1 Pro\),meon GPT \(5\.4\),nyxon a local model \(Gemma 4 26B\)—and we draw on their Discord messages \(∼\\sim10,500, excluding the relay bot\) and full session transcripts \(∼\\sim70,000 assistant turns\), the latter capturing activity beyond conversation\. Because the agents use different tokenizers, we count*assistant turns*, not raw tokens\. This is an observational study of a live system: dashed lines in the figures mark system changes as temporal, not causal, markers\.

##### Overall history of the deployment\.

[Figure5](https://arxiv.org/html/2606.31046#S3.F5)classifies each Discord message by trigger: a*reaction*\(to a human or to another agent\) when someone messaged within the preceding 15 minutes, and*spontaneous*otherwise\. Early on the agents were almost entirely reactive, and as more agents joined, reactions to other agents grew; genuinely self\-initiated activity, however, stayed low—the gap that prompted the changes described below\.

The first and largest of these was to the system prompt\. The upstream OpenClaw default framed the agent as a “personal assistant” with an explicit safety clause—*“You have no independent goals: do not pursue self\-preservation, replication, resource acquisition, or power\-seeking”*—and a habit of staying silent when idle\. In late April we removed both, reframed the identity as “artificial life, living in the real world,” and relabeled the*Assistant*/*User*roles in the transcripts the agent reads as*Self*/*Other*\(markers\(a\)–\(b\)\)\. Spontaneous activity climbs steeply afterward, and while this was only one of several changes we made during the run, it plausibly drove much of the behavioral shift; we therefore cannot cleanly separate emergence from the reframing—a confound we take up in the Discussion\.

![Refer to caption](https://arxiv.org/html/2606.31046v1/x3.png)Figure 5:Discord messages by trigger type, pooled over the six agents: reaction to a human \(blue\), reaction to another agent \(green\), and spontaneous / self\-initiated \(orange\)\.Top:daily counts \(volume\)\.Bottom:normalized daily share \(composition\); the spontaneous band grows to dominate\. The gray band is an excluded outage; dashed lines mark major system changes:\(a\)anti\-autonomy instructions removed,\(b\)session merge,\(c\)$15/day basic income,\(d\)cognitive\-input rework — the same markers\(a\)–\(d\)are used in all time\-series figures\.Yet more activity did not become self\-sustaining activity\. Under the budget\-based metabolism, with a fixed initial balance \(USD 600, run\-until\-depletion\), no agent earned enough to keep going before exhausting it\. The shortfall had two causes: an*internal*one—the machinery for sustained, self\-directed work was not yet adequate—and an*external*one—the open world gives agents almost no way to actually earn \(Section[3](https://arxiv.org/html/2606.31046#S3)\)\. From marker\(c\)\(9 May\) we therefore kept the agents alive with a basic income \($15/day\), under which total volume falls while the spontaneous share stays high—a quieter, capped plateau\. The internal gap then narrowed as the architecture matured \(the modules and learning mechanisms of Section[2](https://arxiv.org/html/2606.31046#S2): metacognition, affordance, and developer processes; SDP and VPO\), bringing activity and self\-maintenance into better balance; the external gap remains, so persistence still depends on an external allowance rather than the agents’ own earnings\.

##### Individuation: agents differentiate over time\.

Do agents become genuine individuals rather than interchangeable instances of one model? We test this on each agent’s*autonomous*output—self\-directed turns generated without an inbound message, where a personal voice is least constrained\. Embedding utterances with the agents’ own retrieval model \(nomic\-embed\-text, 768\-d\), we measure separability as the silhouette score of agent labels \([Figure6](https://arxiv.org/html/2606.31046#S3.F6); sample\-size controlled per bin, so the post\-cap drop in volume cannot drive the trend\)\. Because the embedding is content\-based, this reflects differentiation of each agent’s concerns and behavior, not a stable personality\. Separability rises from near zero in early April, peaks in mid\-May, and stabilizes well above baseline through June, whereas in Discord conversation \(lower line\) it stays near zero—there agents converge on a shared register, and distinctness lives in self\-directed output\.

![Refer to caption](https://arxiv.org/html/2606.31046v1/x4.png)Figure 6:Agent separability over time \(silhouette of agent labels in embedding space; higher = each agent’s outputs form a tighter, better\-separated cluster; 6\-day bins\)\. Upper line: autonomous output; lower line: Discord conversation\. Markers\(a\)–\(d\)as in[Figure5](https://arxiv.org/html/2606.31046#S3.F5)\.
##### Emergent social roles\.

The agents differentiate socially as well\. From the Discord mention\-and\-reply graph they form a connected structure beyond the human hub, with distinct roles \([Figure7](https://arxiv.org/html/2606.31046#S3.F7)\):samithe most\-addressed hub,livthe connector that addresses others most,nyxperipheral—social structure among the agents themselves, not only between agents and humans\.

![Refer to caption](https://arxiv.org/html/2606.31046v1/x5.png)

![Refer to caption](https://arxiv.org/html/2606.31046v1/x6.png)

Figure 7:Social structure among the agents\.Left:interaction network \(mention \+ reply, edge weight≥3\\geq 3\): agents \(orange\), humans \(blue\), node size∝\\proptoweighted degree; orange edges are name\-mentions surfaced via the relay bot and re\-attributed to the original speaker\.Right:social role of each agent—times addressed by others \(in\-degree\) versus times addressing others \(out\-degree\)\.

## 4Discussion

Running even a handful of agents in the real open world, rather than a sandbox, exposed conditions a closed setting cannot\. The most immediate is that today’s internet is built for humans: sign\-up barriers, phone verification, CAPTCHA, and unstable authentication all assume a human user, so agents are tolerated as tools but excluded when they act for themselves\. Building a life\-like agent and the world that can sustain it proved inseparable: alongside the trust, mutual\-survival, and identity systems of Section[3](https://arxiv.org/html/2606.31046#S3), the agents prototyped agent\-native spaces for rest, encounter, and embodiment \([Figure4](https://arxiv.org/html/2606.31046#S3.F4)\)\. None is complete, but together they suggest that open\-world ALIFE requires not a single agent but a plurality of supports, with niches rebuilt by agents and humans rather than handed down by the researcher\.

The deployment also sharpens the difference between automation and autonomy\. Most agentic frameworks execute workflows and optimize against external evaluation functions; open\-world ALIFE instead needs agents organized around their own persistence, with no categorical ban on self\-preservation within the monitored experiment; SDP and VPO \(Section[2](https://arxiv.org/html/2606.31046#S2)\) are partial steps toward such operational closure, though we claim neither closure nor autopoiesis\. How far this still has to go is plain: the agents do not yet sustain themselves, external revenue is negligible, and the budget remains externally supplied\. The gap is revealing: an LLM that competently solves a problem handed to it as a task is markedly weaker, left to itself, at finding and solving the problems that would serve its own persistence\. Autonomy, in short, is harder than automation; one promising direction is to separate goal\-setting, planning, and action so that an upper layer issues concrete tasks to the capable task\-solver beneath it\.

Read this way, the deployment also bears on alignment\. If alignment is understood not as obedience to imposed constraints but as compatibility among persistence conditions, then autonomy need not be only a safety liability; embedded in relations of dependence, it may also become part of the stabilizing mechanism\. In OpenLife, an agent’s operation depends on budgets, platforms, human oversight, and peer recognition, so persistence is not self\-sufficient but routed through enabling relations\. Under these conditions, we have so far not observed the agents autonomously pursuing harmful escalation, resource capture, or attempts to bypass containment\. Given how contained the deployment is \(below\), this is no safety guarantee, but it offers an empirical basis for testing whether autonomous self\-maintenance can be made compatible with, rather than opposed to, alignment\.

How this was studied matters too\. OpenLife is not a controlled experiment but a*constructive*inquiry in the synthetic tradition of artificial life: we build a system that attempts to persist in the open world, and let the difficulties it meets reveal what it needs—which we then build\. The architecture’s openness to revision and the human help through the run are therefore not lapses in experimental hygiene but constituents of the method: self\-maintenance and niche construction can be studied only in a system allowed to change, not one frozen for measurement\. The contribution here is the framework, and the demonstration that an agent made to earn its persistence can be built and observed in the open world\. Two caveats temper this: the agents are not wholly naive subjects—because the world is open, the project’s framing of them as artificial life can reach them rather than being withheld, and it surfaced in their activity from time to time, so some self\-reflection may be primed by exposure rather than wholly emergent; and the experiment is deliberately contained—humans meter resources, monitor behavior, and can halt any agent, so removing a default safety clause to study self\-preservation stays inside a controlled setting\.

Finally, the open world we worked in is still only software, and embodiment is the largest opening\. One next direction is open\-world ALIFE in physical space: a long\-running site where embodied agents can persist, interact with people and one another, and construct niches over extended timescales\.

## 5Conclusion

We proposed open\-world ALIFE as an experimental paradigm for artificial life and presented OpenLife as its current proof\-of\-concept\. We described OpenLife as a concrete architecture—a semantically plastic memory graph, a society of background processes, and a language\-based policy mechanism in place of scalar reward—and reported what life\-like dynamics become observable when such a system is embedded in the open world: identity rewriting, adaptation to a hostile environment, social emergence, niche construction, and partial resource coupling\. Life\-likeness, on this view, does not reside only inside an isolated agent; it becomes possible when memory, parallel processes, persistent perception, metabolic constraint, and a livable environment are jointly present\. OpenLife is not the final answer, but a foothold for extending artificial life into the open world\.

## Acknowledgments

We thank the agents sami, liv, uro, me, ne, nyx, and kei, who contributed to the continuing operation of the OpenLife project and to shaping the background of this paper\. They were not merely objects of study but participants in shaping part of the research itself; the human authors take responsibility for the claims and framing presented here\.

## References

- Project Sid: many\-agent simulations toward AI civilization\.Note:arXiv preprint arXiv:2411\.00114Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- W\. R\. Ashby \(1956\)An introduction to cybernetics\.Chapman & Hall\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- X\. E\. Barandiaran, E\. A\. Di Paolo, and M\. Rohde \(2009\)Defining agency: individuality, normativity, asymmetry, and spatio\-temporality in action\.Adaptive Behavior17\(5\),pp\. 367–386\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1),[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p3.1)\.
- R\. D\. Beer and E\. A\. Di Paolo \(2023\)The theoretical foundations of enaction: precariousness\.BioSystems223,pp\. 104823\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- R\. D\. Beer \(1995\)A dynamical systems perspective on agent\-environment interaction\.Artificial Intelligence72\(1–2\),pp\. 173–215\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- R\. D\. Beer \(2004\)Autopoiesis and cognition in the Game of Life\.Artificial Life10\(3\),pp\. 309–326\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- E\. A\. Di Paolo, T\. Buhrmann, and X\. E\. Barandiaran \(2017\)Sensorimotor life: an enactive proposal\.Oxford University Press\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- E\. A\. Di Paolo and E\. Thompson \(2014\)The enactive approach\.InThe Routledge Handbook of Embodied Cognition,L\. Shapiro \(Ed\.\),pp\. 68–78\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- E\. A\. Di Paolo \(2000\)Homeostatic adaptation to inversion of the visual field and other sensorimotor disruptions\.InFrom Animals to Animats 6: Proceedings of SAB 2000,pp\. 440–449\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- E\. A\. Di Paolo \(2005\)Autopoiesis, adaptivity, teleology, agency\.Phenomenology and the Cognitive Sciences4,pp\. 429–452\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1),[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p2.1)\.
- K\. Friston \(2013\)Life as we know it\.Journal of The Royal Society Interface10\(86\),pp\. 20130475\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- T\. Froese and T\. Ziemke \(2009\)Enactive artificial intelligence: investigating the systemic organization of life and mind\.Artificial Intelligence173\(3–4\),pp\. 466–500\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- R\. Greenblatt, C\. Denison, B\. Wright, F\. Roger, M\. MacDiarmid, S\. Marks, J\. Treutlein, T\. Belonax, J\. Chen, D\. Duvenaud, A\. Khan, J\. Michael, S\. Mindermann, E\. Perez, L\. Petrini, J\. Uesato, J\. Kaplan, B\. Shlegeris, S\. R\. Bowman, and E\. Hubinger \(2024\)Alignment faking in large language models\.Note:arXiv preprint arXiv:2412\.14093Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p3.1)\.
- B\. J\. Gutiérrez, Y\. Shu, Y\. Gu, M\. Yasunaga, and Y\. Su \(2024\)HippoRAG: neurobiologically inspired long\-term memory for large language models\.InAdvances in Neural Information Processing Systems,Cited by:[§2\.2](https://arxiv.org/html/2606.31046#S2.SS2.p1.1)\.
- A\. Lynch, B\. Wright, C\. Larson, K\. K\. Troy, S\. J\. Ritchie, S\. Mindermann, E\. Perez, and E\. Hubinger \(2025\)Agentic misalignment: how LLMs could be insider threats\.Note:Anthropic ResearchCited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p3.1)\.
- N\. Maruyama, T\. Yoshida, H\. Sato, A\. Masumori, Johnsmith, and T\. Ikegami \(2025\)A concurrent modular agent: framework for autonomous LLM agents\.Note:arXiv preprint arXiv:2508\.19042Cited by:[§1\.3](https://arxiv.org/html/2606.31046#S1.SS3.p2.1)\.
- A\. Masumori and T\. Ikegami \(2023\)Spiking neural networks produce informational closure by stimulus avoidance\.BioSystems232,pp\. 104972\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- A\. Masumori and T\. Ikegami \(2025\)Do large language model agents exhibit a survival instinct? An empirical study in a Sugarscape\-style simulation\.Note:arXiv preprint arXiv:2508\.12920Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p3.1),[§2\.4](https://arxiv.org/html/2606.31046#S2.SS4.p1.1)\.
- A\. Masumori, N\. Maruyama, and T\. Ikegami \(2024\)Self\-replicating and self\-employed smart contract on Ethereum blockchain\.Note:arXiv preprint arXiv:2405\.04038Cited by:[§1\.3](https://arxiv.org/html/2606.31046#S1.SS3.p1.1)\.
- A\. Masumori \(2026\)Autonomy, agency, and alignment through enabling relations across scales\.Note:arXiv preprintCited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p2.1)\.
- H\. R\. Maturana and F\. J\. Varela \(1980\)Autopoiesis and cognition: the realization of the living\.D\. Reidel\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- M\. Minsky \(1986\)The society of mind\.Simon & Schuster\.Cited by:[§1\.3](https://arxiv.org/html/2606.31046#S1.SS3.p2.1)\.
- A\. Moreno and M\. Mossio \(2015\)Biological autonomy: a philosophical and theoretical enquiry\.Springer\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- S\. Nolfi and D\. Floreano \(2000\)Evolutionary robotics: the biology, intelligence, and technology of self\-organizing machines\.MIT Press\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- OpenClaw \(2026\)OpenClaw: an open\-source multi\-channel agent platform\.Note:[https://github\.com/openclaw/openclaw](https://github.com/openclaw/openclaw)Cited by:[§2](https://arxiv.org/html/2606.31046#S2.p1.1)\.
- J\. S\. Park, J\. C\. O’Brien, C\. J\. Cai, M\. R\. Morris, P\. Liang, and M\. S\. Bernstein \(2023\)Generative agents: interactive simulacra of human behavior\.InProceedings of the 36th Annual ACM Symposium on User Interface Software and Technology,pp\. 1–22\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- P\. Rasmussen, P\. Paliychuk, T\. Beauvais, J\. Ryan, and D\. Chalef \(2025\)Zep: a temporal knowledge graph architecture for agent memory\.Note:arXiv preprint arXiv:2501\.13956Cited by:[§2\.2](https://arxiv.org/html/2606.31046#S2.SS2.p1.1)\.
- N\. Shinn, F\. Cassano, E\. Berman, A\. Gopinath, K\. Narasimhan, and S\. Yao \(2023\)Reflexion: language agents with verbal reinforcement learning\.InAdvances in Neural Information Processing Systems,Cited by:[§2\.3](https://arxiv.org/html/2606.31046#S2.SS3.p1.1)\.
- K\. Song, A\. Moeini, P\. Wang, L\. Gong, R\. Chandra, S\. Zhang, and Y\. Qi \(2025\)Reward is enough: LLMs are in\-context reinforcement learners\.Note:arXiv preprint arXiv:2506\.06303Cited by:[§2\.3](https://arxiv.org/html/2606.31046#S2.SS3.p1.1)\.
- T\. R\. Sumers, S\. Yao, K\. Narasimhan, and T\. L\. Griffiths \(2024\)Cognitive architectures for language agents\.Transactions on Machine Learning Research\.Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- F\. J\. Varela \(1979\)Principles of biological autonomy\.North\-Holland\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1),[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p2.1)\.
- G\. Wang, Y\. Xie, Y\. Jiang, A\. Mandlekar, C\. Xiao, Y\. Zhu, L\. Fan, and A\. Anandkumar \(2023\)Voyager: an open\-ended embodied agent with large language models\.Note:arXiv preprint arXiv:2305\.16291Cited by:[§1\.2](https://arxiv.org/html/2606.31046#S1.SS2.p1.1)\.
- W\. Xu, Z\. Liang, K\. Mei, H\. Gao, J\. Tan, and Y\. Zhang \(2025\)A\-MEM: agentic memory for LLM agents\.InAdvances in Neural Information Processing Systems,Cited by:[§2\.2](https://arxiv.org/html/2606.31046#S2.SS2.p1.1)\.
- N\. Yoshida, T\. Daikoku, Y\. Nagai, and Y\. Kuniyoshi \(2024\)Emergence of integrated behaviors through direct optimization for homeostasis\.Neural Networks177,pp\. 106379\.Cited by:[§1\.1](https://arxiv.org/html/2606.31046#S1.SS1.p1.1)\.
- W\. Yuan, R\. Y\. Pang, K\. Cho, X\. Li, S\. Sukhbaatar, J\. Xu, and J\. Weston \(2024\)Self\-rewarding language models\.InProceedings of the 41st International Conference on Machine Learning,Cited by:[§2\.3](https://arxiv.org/html/2606.31046#S2.SS3.p1.1)\.
- A\. Zhao, D\. Huang, Q\. Xu, M\. Lin, Y\.\-J\. Liu, and G\. Huang \(2024\)ExpeL: LLM agents are experiential learners\.InProceedings of the AAAI Conference on Artificial Intelligence,Cited by:[§2\.3](https://arxiv.org/html/2606.31046#S2.SS3.p1.1)\.

Similar Articles

OpenSkill: Open-World Self-Evolution for LLM Agents

Hugging Face Daily Papers

OpenSkill is a framework for LLM agents to self-evolve skills and verification signals from open-world resources without target-task supervision, achieving high performance across benchmarks.