SAM: State-Adaptive Memory for Long-Horizon Reasoning Agent
Summary
This paper proposes SAM, a state-adaptive memory framework that dynamically manages interaction histories for long-horizon agentic reasoning, enabling intent-driven recall without retraining the backbone model. It outperforms strong baselines on BrowseComp, BrowseComp-ZH, WideSearch, and HLE.
Cached at: 05/27/26, 06:48 AM
Source: https://huggingface.co/papers/2605.24468
Abstract
Long-horizon agentic reasoning is enhanced through a state-adaptive memory framework that dynamically manages interaction histories by creating compact memory cues while preserving detailed trajectories for targeted retrieval.
Long-horizon agentic reasoning requires large language models to act over long interaction histories containing thoughts, tool calls, observations, and partial conclusions. The challenge is not merely that these histories grow long, but that information needed for the current decision may be scattered across distant steps and only become relevant later. Existing approaches address this difficulty by truncating the interaction history, compressing it into shorter surrogates, or retrieving selected parts of it for reuse, but they do not explicitly model how access to past interaction should adapt to the agent’s evolving state. We instead cast long-horizon reasoning as a problem of state-adaptive memory. To this end, we propose State-Adaptive Memory (SAM), a standalone framework that consolidates ongoing interaction into compact memory cues while preserving raw trajectory pages for intent-driven recall. These cues are not treated as replacements for history; rather, they serve as lightweight handles that allow the agent to reconstruct temporally distant information according to its current needs, without retraining the underlying backbone. We further optimize the memory module through expert-guided supervision and reinforcement learning, aligning it with trajectory-level utility. Across BrowseComp, BrowseComp-ZH, WideSearch, and HLE, SAM consistently outperforms strong baselines over diverse agent backbones. Our results suggest that explicit memory modeling provides a simple and effective foundation for long-horizon agentic reasoning.
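The cue-and-page idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `Cue`, `CueMemory`, `consolidate`, and `recall` are hypothetical, the summaries are supplied by hand, and plain word overlap stands in for the learned memory module that the paper trains with supervision and reinforcement learning. It only shows the data flow: raw steps are kept in full as pages, short cues act as handles to them, and a query describing the current intent pulls back the matching raw page.

```python
from dataclasses import dataclass, field


@dataclass
class Cue:
    page_id: int  # handle pointing back to the raw trajectory page
    text: str     # compact summary of that page


@dataclass
class CueMemory:
    pages: list[str] = field(default_factory=list)  # raw history, never discarded
    cues: list[Cue] = field(default_factory=list)   # short handles kept in context

    def consolidate(self, raw_step: str, summary: str) -> int:
        """Store the raw step as a page and keep a short cue that points to it."""
        page_id = len(self.pages)
        self.pages.append(raw_step)
        self.cues.append(Cue(page_id, summary))
        return page_id

    def recall(self, intent: str, top_k: int = 1) -> list[str]:
        """Return the raw pages whose cues best match the current intent.

        Assumption: word overlap is a toy stand-in for a learned scorer.
        """
        wanted = set(intent.lower().split())
        scored = [
            (len(wanted & set(cue.text.lower().split())), cue.page_id)
            for cue in self.cues
        ]
        # Highest overlap first; earlier pages win ties.
        ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
        return [self.pages[pid] for score, pid in ranked[:top_k] if score > 0]


memory = CueMemory()
memory.consolidate("search('launch date') -> result page text ...", "launch date lookup")
memory.consolidate("open('/team') -> list of names ...", "team member list")
recalled = memory.recall("need the launch date")
```

Here `recalled` holds the full text of the first step rather than its cue, which mirrors the abstract's point that cues are handles to history and not replacements for it.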
Similar Articles
MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval
MemReranker is a reasoning-aware reranking model family (0.6B/4B) designed for agent memory retrieval, addressing limitations in semantic similarity by incorporating LLM knowledge distillation for better temporal and causal reasoning.
H-Mem: A Novel Memory Mechanism for Evolving and Retrieving Agent Memory via a Hybrid Structure
H-Mem is a novel memory mechanism for LLM-based agents that uses a hybrid structure combining a temporal and semantic tree with a knowledge graph to model memory evolution and improve retrieval, achieving state-of-the-art performance on QA benchmarks.
Agentic Recommender System with Hierarchical Belief-State Memory
This paper proposes ARS, a memory-augmented agentic recommender system that treats recommendation as a partially observable problem with a hierarchical belief-state memory structure. It achieves state-of-the-art performance on four benchmarks with significant improvements over baselines.
@rshia_afz: 1/ SSMs struggle on recall benchmarks due to their fixed-size state. But are current models actually storing context “w…
The article introduces Raven, a new State Space Model (SSM) with selective memory allocation that achieves state-of-the-art performance on recall tasks and demonstrates superior length generalization compared to existing models like SWA.
Learning to Retrieve: Dual-Level Long-Term Memory for Text-to-SQL Agents
This paper proposes MERIT, a dynamic multi-horizon memory retrieval framework for interactive text-to-SQL agents that uses episode-level and turn-level memory with learned retrieval policies optimized via reinforcement learning and a process reward model for dense rewards. Experiments on BIRD-Interact and Spider2-Snow show that MERIT outperforms static and single-horizon dynamic baselines in success rate while requiring fewer interaction turns.