BOOKMARKS: Efficient Active Storyline Memory for Role-playing
Summary
BOOKMARKS is a search-based memory framework for role-playing agents that actively maintains task-relevant story details through structured bookmarks, outperforming existing recurrent summarization methods.
Source: https://huggingface.co/papers/2605.14169
Abstract
BOOKMARKS is a search-based memory framework that improves role-playing agents by actively managing task-relevant information through structured bookmarks that capture detailed character behaviors and story elements.
Memory systems are critical for role-playing agents (RPAs) to maintain long-horizon consistency. However, existing RPA memory methods (e.g., profiling) mainly rely on recurrent summarization, whose compression inevitably discards important details. To address this issue, we propose a search-based memory framework called BOOKMARKS, which actively initializes, maintains, and updates task-relevant pieces of bookmarks for the current task (e.g., character acting). A bookmark is structured as the answer to a question at a specific point in the storyline. For each current task, BOOKMARKS selects reusable existing bookmarks or initializes new ones (at the storyline beginning) with useful questions. These bookmarks are then synchronized to the current story point, with their answers updated accordingly, so they can be efficiently reused in future grounding rounds. Compared with recurrent summarization, BOOKMARKS offers (1) active grounding for capturing task-specific details and (2) passive updating to avoid unnecessary computation. In implementation, BOOKMARKS supports concept, behavior, and state searches, each powered by an efficient synchronization method. BOOKMARKS significantly outperforms RPA memory baselines on 85 characters from 16 artifacts, demonstrating the effectiveness of search-based memory for RPAs.
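The bookmark abstraction above (a question-answer pair pinned to a storyline point, reused and lazily synchronized) can be sketched in code. This is a minimal illustrative sketch, not the paper's implementation: the `Bookmark` and `BookmarkMemory` classes and the keyword-based `_update_answer` stub (standing in for an LLM update call) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Bookmark:
    """A bookmark: the answer to a question at a specific storyline point (sketch)."""
    question: str
    answer: str
    story_point: int  # index of the last storyline event folded into the answer


class BookmarkMemory:
    """Minimal sketch of search-based storyline memory (assumed API, not the paper's)."""

    def __init__(self):
        self.bookmarks: list[Bookmark] = []

    def get(self, question: str, story: list[str]) -> str:
        # Reuse an existing bookmark if one answers this question;
        # otherwise initialize a new one from the storyline beginning.
        for bm in self.bookmarks:
            if bm.question == question:
                self._sync(bm, story)
                return bm.answer
        bm = Bookmark(question=question, answer="", story_point=0)
        self.bookmarks.append(bm)
        self._sync(bm, story)
        return bm.answer

    def _sync(self, bm: Bookmark, story: list[str]) -> None:
        # Passive updating: process only the events added since the bookmark's
        # last sync point, instead of re-summarizing the whole storyline.
        for event in story[bm.story_point:]:
            bm.answer = self._update_answer(bm.question, bm.answer, event)
        bm.story_point = len(story)

    def _update_answer(self, question: str, answer: str, event: str) -> str:
        # Placeholder for an LLM call that revises the answer given a new event;
        # here we just keep events mentioning the question's last keyword.
        key = question.split()[-1].rstrip("?").lower()
        if key in event.lower():
            return (answer + " " + event).strip()
        return answer
```

A usage sketch: asking the same question twice touches only the events appended between the two calls, which is the "avoid unnecessary computation" property the abstract claims for passive updating.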
Get this paper in your agent:
hf papers read 2605.14169
Don't have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash
Similar Articles
From Recall to Forgetting: Benchmarking Long-Term Memory for Personalized Agents
Researchers introduce Memora, a benchmark that evaluates LLMs' ability to retain, update, and forget long-term user memories over conversations spanning weeks to months, revealing frequent reuse of obsolete memories.
MemReread: Enhancing Agentic Long-Context Reasoning via Memory-Guided Rereading
MemReread introduces a method for long-context reasoning that avoids intermediate retrieval by decomposing questions and rereading text to recover discarded information, achieving linear time complexity. It outperforms baseline frameworks on long-context reasoning tasks.
MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval
MemReranker is a reasoning-aware reranking model family (0.6B/4B) designed for agent memory retrieval, addressing limitations in semantic similarity by incorporating LLM knowledge distillation for better temporal and causal reasoning.
I built a benchmark for AI "memory" in coding agents. Looking for others to beat it.
A developer created a new benchmark, continuity-benchmarks, to test AI coding agents' ability to stay consistent with project rules during active development. It addresses gaps in existing memory benchmarks, which focus on semantic recall rather than real-time architectural consistency and multi-session behavior.
LongMemEval-V2: Evaluating Long-Term Agent Memory Toward Experienced Colleagues
This paper introduces LongMemEval-V2, a benchmark for evaluating long-term memory systems in web agents, along with two memory methods: AgentRunbook-R and AgentRunbook-C.