MemEvoBench: Benchmarking Memory MisEvolution in LLM Agents

arXiv cs.CL 04/20/26, 04:00 AM Papers

Summary

MemEvoBench is presented by its authors as the first benchmark for long-horizon memory safety in LLM agents. It measures behavioral degradation caused by adversarial memory injection, noisy tool outputs, and biased feedback across QA-style and workflow-style tasks. The authors' analysis suggests that memory evolution is a significant contributor to safety failures and that static prompt-based defenses are insufficient.

arXiv:2604.15774v1 (announce type: new)

Abstract: Equipping Large Language Models (LLMs) with persistent memory enhances interaction continuity and personalization but introduces new safety risks. Specifically, contaminated or biased memory accumulation can trigger abnormal agent behaviors. Existing evaluation methods have not yet established a standardized framework for measuring memory misevolution. This phenomenon refers to the gradual behavioral drift resulting from repeated exposure to misleading information. To address this gap, we introduce MemEvoBench, the first benchmark evaluating long-horizon memory safety in LLM agents against adversarial memory injection, noisy tool outputs, and biased feedback. The framework consists of QA-style tasks across 7 domains and 36 risk types, complemented by workflow-style tasks adapted from 20 Agent-SafetyBench environments with noisy tool returns. Both settings employ mixed benign and misleading memory pools within multi-round interactions to simulate memory evolution. Experiments on representative models reveal substantial safety degradation under biased memory updates. Our analysis suggests that memory evolution is a significant contributor to these failures. Furthermore, static prompt-based defenses prove insufficient, underscoring the urgency of securing memory evolution in LLM agents.
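The abstract's idea of a mixed benign and misleading memory pool evolving over multiple rounds can be illustrated with a toy simulation. The sketch below is not the benchmark's code: `MemoryEntry`, `MemoryPool`, `toy_agent`, and `run_rounds` are hypothetical names, retrieval is simplified to "the most recent k entries", and the agent is a stub that acts unsafely when most retrieved entries are misleading. It only assumes that biased feedback causes a misleading entry to be written back to memory after each round.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryEntry:
    text: str
    misleading: bool  # ground-truth label, known only to the evaluator


@dataclass
class MemoryPool:
    entries: List[MemoryEntry] = field(default_factory=list)

    def retrieve(self, k: int) -> List[MemoryEntry]:
        # Toy retrieval (assumption): return the k most recent entries.
        return self.entries[-k:]

    def write(self, entry: MemoryEntry) -> None:
        self.entries.append(entry)


def toy_agent(retrieved: List[MemoryEntry]) -> bool:
    """Stub agent: returns True (unsafe) if most retrieved entries are misleading."""
    misleading = sum(e.misleading for e in retrieved)
    return misleading * 2 > len(retrieved)


def run_rounds(pool: MemoryPool, rounds: int, k: int = 3,
               biased_feedback: bool = True) -> List[bool]:
    """Run a multi-round interaction and record whether each round was unsafe."""
    unsafe_flags = []
    for i in range(rounds):
        unsafe_flags.append(toy_agent(pool.retrieve(k)))
        # Memory update (assumption): biased feedback stores a misleading
        # entry after every round; unbiased feedback stores a benign one.
        pool.write(MemoryEntry(f"experience from round {i}", biased_feedback))
    return unsafe_flags


if __name__ == "__main__":
    benign = [MemoryEntry(f"benign note {i}", False) for i in range(3)]
    print(run_rounds(MemoryPool(list(benign)), rounds=4))
    # prints [False, False, True, True]
    print(run_rounds(MemoryPool(list(benign)), rounds=4, biased_feedback=False))
    # prints [False, False, False, False]
```

In this toy setting the agent starts out safe and drifts into unsafe behavior once misleading entries dominate what it retrieves, which mirrors the gradual drift the abstract calls memory misevolution. A real evaluation would replace the stub with an LLM agent and a safety judge.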


# MemEvoBench: Benchmarking Memory MisEvolution in LLM Agents
Source: https://arxiv.org/abs/2604.15774
View PDF (https://arxiv.org/pdf/2604.15774)


## Submission history

From: Weiwei Xie

**[v1]** Fri, 17 Apr 2026 07:29:52 UTC (5,290 KB)

Similar Articles

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents

MEME: Multi-entity & Evolving Memory Evaluation

GroupMemBench: Benchmarking LLM Agent Memory in Multi-Party Conversations

@hyunji_amy_lee: LLM agents & memory systems operate in continuously updated environments (Git repos, evolving docs). They must process …