BOOKMARKS: Efficient Active Storyline Memory for Role-playing
Summary
BOOKMARKS is a search-based memory framework for role-playing agents that actively maintains task-relevant story details through structured bookmarks, outperforming existing recurrent summarization methods.
Cached at: 05/15/26, 04:24 AM
Source: https://huggingface.co/papers/2605.14169
Abstract
BOOKMARKS is a search-based memory framework that improves role-playing agents by actively managing task-relevant information through structured bookmarks that capture detailed character behaviors and story elements.
Memory systems are critical for role-playing agents (RPAs) to maintain long-horizon consistency. However, existing RPA memory methods (e.g., profiling) mainly rely on recurrent summarization, whose compression inevitably discards important details. To address this issue, we propose a search-based memory framework called BOOKMARKS, which actively initializes, maintains, and updates task-relevant pieces of bookmarks for the current task (e.g., character acting). A bookmark is structured as the answer to a question at a specific point in the storyline. For each current task, BOOKMARKS selects reusable existing bookmarks or initializes new ones (at storyline beginning) with useful questions. These bookmarks are then synchronized to the current story point, with their answers updated accordingly, so they can be efficiently reused in future grounding rounds. Compared with recurrent summarization, BOOKMARKS offers (1) active grounding for capturing task-specific details and (2) passive updating to avoid unnecessary computation. In implementation, BOOKMARKS supports concept, behavior, and state searches, each powered by an efficient synchronization method. BOOKMARKS significantly outperforms RPA memory baselines on 85 characters from 16 artifacts, demonstrating the effectiveness of search-based memory for RPAs.
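The bookmark idea in the abstract (an answer to a question, pinned to a point in the storyline and synchronized forward only when needed) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `Bookmark`, `BookmarkStore`, `ground`, and `keyword_update` are hypothetical, the storyline is assumed to be a list of text chunks, bookmarks are reused by exact question match, and the update function is a toy keyword filter standing in for whatever model call the authors actually use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Bookmark:
    """Hypothetical bookmark: the answer to a question at a story position."""
    question: str
    answer: str = ""
    position: int = 0  # number of storyline chunks already folded into the answer


# Assumed signature: (question, previous_answer, new_chunks) -> updated answer
UpdateFn = Callable[[str, str, List[str]], str]


class BookmarkStore:
    """Toy store: reuses bookmarks and syncs them lazily to the current story point."""

    def __init__(self, story: List[str], update_fn: UpdateFn) -> None:
        self.story = story
        self.update_fn = update_fn
        self.bookmarks: Dict[str, Bookmark] = {}
        self.chunks_read = 0  # counts chunks processed, to show the saved work

    def ground(self, questions: List[str], story_point: int) -> Dict[str, str]:
        answers: Dict[str, str] = {}
        for q in questions:
            # Reuse an existing bookmark, or initialize one at the story beginning.
            bm = self.bookmarks.setdefault(q, Bookmark(question=q))
            # Passive updating: only read chunks the bookmark has not seen yet.
            if bm.position < story_point:
                new_chunks = self.story[bm.position:story_point]
                bm.answer = self.update_fn(q, bm.answer, new_chunks)
                bm.position = story_point
                self.chunks_read += len(new_chunks)
            answers[q] = bm.answer
        return answers


def keyword_update(question: str, prev_answer: str, new_chunks: List[str]) -> str:
    """Toy stand-in for a model call: keep the latest chunk mentioning the
    last word of the question, otherwise keep the previous answer."""
    key = question.lower().rstrip("?").split()[-1]
    hits = [c for c in new_chunks if key in c.lower()]
    return hits[-1] if hits else prev_answer


story = [
    "Alice finds a sword.",
    "Bob leaves town.",
    "Alice loses the sword.",
    "Alice buys a shield.",
]
store = BookmarkStore(story, keyword_update)
first = store.ground(["Where is the sword?"], story_point=2)
second = store.ground(["Where is the sword?"], story_point=4)
third = store.ground(["Where is the sword?"], story_point=4)  # no new chunks read
```

In this sketch the second call reads only chunks 2 and 3 rather than the whole story, and the third call reads nothing, which is the kind of saving the abstract attributes to passive updating. The sketch does not handle moving to an earlier story point or choosing which questions are useful for a task.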
Similar Articles
Staying In Character: Perspective-Bounded Memory For Book-Based Role-Playing Agents
This paper proposes ReverieMem, a three-layer memory architecture for book-based LLM role-playing agents that prevents factual overreach and stylistic monotony. It also introduces the KBF-QA benchmark and achieves significant improvements in knowledge boundary fidelity and narrative quality.
I thought markdown memory would be enough for agents. It turned into prompt debt.
The author reflects on the limitations of using flat markdown files for long-term agent memory, which leads to prompt debt as the memory grows, and advocates for graph-based memory representations that retrieve relevant context dynamically.
Personalize-then-Store: Benchmarking and Learning Personalized Memory for Long-horizon Agents
This paper introduces PerMemBench, the first benchmark for evaluating personalized memory systems in LLM-based agents, and proposes a session-level storage gating framework that adapts memory policies to individual user contexts.
SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent
This paper proposes SAM, a state-adaptive memory framework that dynamically manages interaction histories for long-horizon agentic reasoning, enabling intent-driven recall without retraining the backbone model. It outperforms strong baselines across multiple benchmarks like BrowseComp and HLE.
From Recall to Forgetting: Benchmarking Long-Term Memory for Personalized Agents
Researchers introduce Memora, a benchmark that evaluates LLMs’ ability to retain, update, and forget long-term user memories over weeks-to-months conversations, revealing frequent reuse of obsolete memories.