To Know is to Construct: Schema-Constrained Generation for Agent Memory

arXiv cs.CL Papers

Summary

UnionPay researchers propose SCG-MEM, a schema-constrained generative memory architecture that eliminates structural hallucinations by forcing LLMs to decode only valid memory keys within a dynamic cognitive schema, outperforming dense-retrieval baselines on the LoCoMo benchmark.

arXiv:2604.20117v1 Announce Type: new Abstract: Constructivist epistemology argues that knowledge is actively constructed rather than passively copied. Despite the generative nature of Large Language Models (LLMs), most existing agent memory systems are still based on dense retrieval. However, dense retrieval heavily relies on semantic overlap or entity matching within sentences. Consequently, embeddings often fail to distinguish instances that are semantically similar but contextually distinct, introducing substantial noise by retrieving context-mismatched entries. Conversely, directly employing open-ended generation for memory access risks "Structural Hallucination" where the model generates memory keys that do not exist in the memory, leading to lookup failures. Inspired by this epistemology, we posit that memory is fundamentally organized by cognitive schemas, and valid recall must be a generative process performed within these schematic structures. To realize this, we propose SCG-MEM, a schema-constrained generative memory architecture. SCG-MEM reformulates memory access as Schema-Constrained Generation. By maintaining a dynamic Cognitive Schema, we strictly constrain LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations. To support long-term adaptation, we model memory updates via assimilation (grounding inputs into existing schemas) and accommodation (expanding schemas with novel concepts). Furthermore, we construct an Associative Graph to enable multi-hop reasoning through activation propagation. Experiments on the LoCoMo benchmark show that SCG-MEM substantially improves performance across all categories over retrieval-based baselines.

# To Know is to Construct: Schema-Constrained Generation for Agent Memory
Source: [https://arxiv.org/html/2604.20117](https://arxiv.org/html/2604.20117)
Weinan Song, Daili Li, Yanming Yang (UnionPay)
zhenglei2@unionpay.com, playinlife@126.com, lidaili@unionpay.com, ymyang@unionpay.com

###### Abstract

Constructivist epistemology argues that knowledge is actively constructed rather than passively copied. Despite the generative nature of Large Language Models (LLMs), most existing agent memory systems are still based on dense retrieval. However, dense retrieval heavily relies on semantic overlap or entity matching within sentences. Consequently, embeddings often fail to distinguish instances that are semantically similar but contextually distinct, introducing substantial noise by retrieving context-mismatched entries. Conversely, directly employing open-ended generation for memory access risks "Structural Hallucination" where the model generates memory keys that do not exist in the memory, leading to lookup failures. Inspired by this epistemology, we posit that memory is fundamentally organized by cognitive schemas, and valid recall must be a generative process performed within these schematic structures. To realize this, we propose SCG-MEM, a schema-constrained generative memory architecture. SCG-MEM reformulates memory access as Schema-Constrained Generation. By maintaining a dynamic Cognitive Schema, we strictly constrain LLM decoding to generate only valid memory entry keys, providing a formal guarantee against structural hallucinations. To support long-term adaptation, we model memory updates via assimilation (grounding inputs into existing schemas) and accommodation (expanding schemas with novel concepts). Furthermore, we construct an Associative Graph to enable multi-hop reasoning through activation propagation. Experiments on the LoCoMo benchmark show that SCG-MEM substantially improves performance across all categories over retrieval-based baselines.

![Refer to caption](https://arxiv.org/html/2604.20117v1/x1.png)

Figure 1: Comparison of Memory Access Paradigms. (a) Dense Retrieval: encodes queries and memory into vectors and retrieves top-$k$ entries via similarity matching. While structurally safe ($\hat{k} \in \mathcal{S}$), it suffers from the semantic gap where nearest neighbors may be contextually irrelevant. (b) Unconstrained Generative Memory: directly prompts the LLM to generate memory keys. This approach risks Structural Hallucination, producing semantically plausible but non-existent keys (e.g., "Concept X" $\notin \mathcal{S}$), leading to lookup failures. (c) SCG-Mem: constrains LLM decoding via a Cognitive Schema (Prefix Trie), guaranteeing that all generated keys are valid ($\hat{k} \in \mathcal{S}$). The Associative Graph then enables multi-hop traversal to gather contextually relevant neighbors.

![Refer to caption](https://arxiv.org/html/2604.20117v1/x2.png)

Figure 2: The SCG-Mem Framework. (A) Evolutionary Schema Construction: new dialogue turns are processed via dual pathways; Assimilation grounds inputs to existing schema nodes through constrained decoding, while Accommodation expands the Prefix Trie with novel concepts via free generation. (B) Relational Topology Construction: co-occurring concepts within each turn are linked in an Associative Graph, with edge weights computed as accumulated IDF products to capture semantic coupling strength. (C) Constructive Recall: given a query, the system first activates seed concepts via schema-constrained decoding (guaranteeing $\hat{k} \in \mathcal{S}$), then performs associative propagation over the graph to gather contextually relevant memory entries for response generation.

## 1 Introduction

Long-term memory is a fundamental capability for autonomous agents, enabling coherent reasoning, personalization, and temporal consistency across extended interactions. In recent years, architectures such as MemGPT (Packer et al. [2023](https://arxiv.org/html/2604.20117#bib.bib13)) and RAG-based systems have proliferated (Zhong et al. [2024](https://arxiv.org/html/2604.20117#bib.bib12); Sarthi et al. [2024](https://arxiv.org/html/2604.20117#bib.bib19); Lee et al. [2024](https://arxiv.org/html/2604.20117#bib.bib11)). Despite their differences, these systems share a common empiricist assumption: memory access is a discriminative retrieval problem. Given a query, the agent searches an external vector store to retrieve a subset of candidate entries based on approximate similarity.

While effective in short-horizon settings, this retrieval-centric paradigm exhibits persistent limitations. First, dense retrieval relies on identifying entities within sentences, yet identical entities frequently recur across different contexts. Since semantic similarity is not equivalent to contextual relevance, embeddings often fail to distinguish semantically identical but contextually distinct instances (Xu et al. [2025a](https://arxiv.org/html/2604.20117#bib.bib32)). Second, most retrieval indices are topologically flat, lacking the relational structure required for associative multi-hop reasoning. Although recent attempts introduce graph structures (Edge et al. [2024](https://arxiv.org/html/2604.20117#bib.bib20); Rezazadeh et al. [2024](https://arxiv.org/html/2604.20117#bib.bib18)), they still rely on dense retrieval to select initial entry nodes, inheriting the same noise problem.

A natural alternative is to reformulate recall as generative reconstruction (Li et al. [2025c](https://arxiv.org/html/2604.20117#bib.bib28)) to exploit the world knowledge and memory understanding capabilities of Large Language Models (LLMs). Memory access typically necessitates the construction of indices, such as inverted indices or graphs, where accessing content requires a specific key. However, allowing Large Language Models to directly generate these keys often results in the production of keys that do not exist in the memory. This is a form of hallucination (Huang et al. [2025](https://arxiv.org/html/2604.20117#bib.bib33)); we define this phenomenon as Structural Hallucination. Such structural hallucinations are catastrophic, as they inevitably lead to lookup failures by pointing to non-existent entries.

Inspired by Piaget's constructivist epistemology (Piaget [1970](https://arxiv.org/html/2604.20117#bib.bib1), [1952](https://arxiv.org/html/2604.20117#bib.bib9)), we recognize that human memory is structured by cognitive schemas, dynamic mental templates for understanding the world, and that the act of recollection is a reconstructive process constrained by these existing structures. Building on this insight, we propose SCG-Mem (Schema-Constrained Generative Memory), a paradigm shift that reformulates memory access from external retrieval to Schema-Constrained Generation (Figure [1](https://arxiv.org/html/2604.20117#S0.F1)).

To operationalize this constructivist framework, we shift memory representation from continuous vectors to a discrete, schema-grounded construction process, where valid recall is confined by an internalized Cognitive Schema. The architecture of SCG-Mem is built upon three synergistic components that collectively govern the existence, evolution, and association of memory. We first distill raw memory entries into discrete Concepts (keywords). Collectively, these concepts form the agent's Cognitive Schema, representing the epistemic boundary of its valid knowledge. Technically, we structure this schema as a dynamic Prefix Trie. During recollection, this Trie functions as a hard constraint on the LLM's decoding process, ensuring that the agent only generates retrieval keys that correspond to valid paths within the schema. By strictly confining the search space to the schema, we mathematically preclude structural hallucinations, guaranteeing that every generated key maps to a valid memory entry.

Crucially, this schema is not static but evolves over time. We model the temporal dynamics through Evolutionary Schema Construction. Following Piagetian dynamics, the system continuously updates the Trie via assimilation (grounding new inputs into existing paths) and accommodation (expanding the Trie with novel concepts). Finally, to capture the semantic relationships between these dynamically evolved concepts, we overlay an Associative Graph. This transforms the schema from a discrete lexicon into a navigable cognitive map, enabling the agent to traverse associative pathways beyond explicit query matches.

We evaluate SCG-Mem on the LoCoMo benchmark. Experiments show that our generative-constructivist approach yields consistent and substantial improvements over retrieval-based baselines across evaluation categories.

In summary, this work makes the following contributions:

- We propose SCG-Mem, a novel memory architecture that reformulates retrieval as schema-constrained generation. By introducing constrained decoding within a valid schema, we effectively eliminate structural hallucinations.
- We introduce an Associative Graph over a Prefix-Trie-based Schema. This hybrid structure enforces key validity via the Trie while enabling associative reasoning via graph traversal.
- We design an Evolutionary Schema Construction mechanism (assimilation and accommodation) that enables stable yet adaptive long-term memory growth.

## 2 Related Work

### 2.1 Agent Memory

Research on agent memory has explored various architectures to optimize memory organization and access. Early systems like MemoryBank (Zhong et al. [2024](https://arxiv.org/html/2604.20117#bib.bib12)) and MemGPT (Packer et al. [2023](https://arxiv.org/html/2604.20117#bib.bib13)) fragment texts into chunks managed via dense retrieval or cache-like tiers, while ReadAgent (Lee et al. [2024](https://arxiv.org/html/2604.20117#bib.bib11)) utilizes gist compression for interactive lookups. To support higher-level reasoning, frameworks such as RAPTOR (Sarthi et al. [2024](https://arxiv.org/html/2604.20117#bib.bib19)) and GraphRAG (Edge et al. [2024](https://arxiv.org/html/2604.20117#bib.bib20)) structure data into recursive trees or knowledge graphs; however, these are often confined to static corpora, necessitating costly reconstruction for updates. Recent dynamic approaches like MemTree (Rezazadeh et al. [2024](https://arxiv.org/html/2604.20117#bib.bib18)) and CAM (Li et al. [2025b](https://arxiv.org/html/2604.20117#bib.bib16)) adopt Piagetian-inspired tree structures for online clustering. Yet, despite these structural advances, they fundamentally rely on discriminative retrieval for initial access, which remains vulnerable to noise from contextually irrelevant but semantically similar vectors. Distinctively, SCG-Mem departs from this retrieval paradigm by introducing schema-constrained decoding, enabling the agent to accurately recall multiple entries in a single generative pass guided by its internalized cognitive schema.

### 2.2 Constrained Decoding

Constrained decoding modifies the probability distribution of a language model during inference to satisfy external constraints. Initial applications focused on lexical constraints, such as Grid Beam Search (Hokamp and Liu [2017](https://arxiv.org/html/2604.20117#bib.bib21); Post and Vilar [2018](https://arxiv.org/html/2604.20117#bib.bib22)), to ensure specific keywords appeared in the output. More recently, this has evolved into syntactic constraints for code generation, where systems like Synchromesh (Poesia et al. [2022](https://arxiv.org/html/2604.20117#bib.bib24)) or PICARD (Scholak et al. [2021](https://arxiv.org/html/2604.20117#bib.bib25)) enforce validity against formal grammars by masking invalid tokens at the logit level. In the domain of generative retrieval, RetroLLM (Li et al. [2025c](https://arxiv.org/html/2604.20117#bib.bib28)) employs FM-index constraints to directly generate fine-grained evidence. Recent advances have extended constrained decoding to graph traversal tasks: methods like GCR (Luo et al. [2024](https://arxiv.org/html/2604.20117#bib.bib26)) and DoG (Li et al. [2025a](https://arxiv.org/html/2604.20117#bib.bib27)) employ constraints to guide LLMs in selecting valid nodes during reasoning over knowledge graphs. However, these approaches operate on static graphs and do not account for dynamic schema evolution. SCG-Mem adapts this mechanism from the syntactic to the semantic and ontological domain. Instead of enforcing valid syntax, we enforce valid memory paths via a Prefix Trie. Unlike standard constrained decoding, which assumes a static constraint set (e.g., a fixed grammar), our approach introduces dynamic constraints that evolve through the agent's lifetime.

## 3 Methodology

We present SCG-Mem (Schema-Constrained Generative Memory), a memory architecture grounded in the constructivist principle that knowing is an active process of construction under constraints. Unlike traditional retrieval systems that treat memory as a static repository accessed via similarity search, SCG-Mem reformulates memory access as a generative process governed by a dynamic cognitive schema.

As illustrated in Figure [2](https://arxiv.org/html/2604.20117#S0.F2), the framework is composed of three tightly coupled components working in synergy. At the foundation lies a Cognitive Schema (implemented as a Prefix Trie), which defines the epistemic boundary of the agent and strictly enforces valid memory access (Sections [3.1](https://arxiv.org/html/2604.20117#S3.SS1) and [3.2](https://arxiv.org/html/2604.20117#S3.SS2)). Crucially, this schema is not static; we employ an Evolutionary Schema Construction mechanism that dynamically updates the schema through assimilation and accommodation processes to ensure long-term adaptability (Section [3.4](https://arxiv.org/html/2604.20117#S3.SS4)). Finally, to support reasoning capabilities beyond simple validity, we overlay this evolving schema with an Associative Graph that enables associative multi-hop reasoning via activation propagation (Section [3.5](https://arxiv.org/html/2604.20117#S3.SS5)). In the following sections, we first formally define the problem of Structural Hallucination, and then detail how each component addresses it to achieve robust, evolving memory.

### 3.1 Problem Formulation: Structural Hallucination

In the context of autonomous agents, memory access is typically modeled as a mapping from a query context $c$ to a memory entry key $k$. We define the agent's cognitive schema $\mathcal{S}$ as a finite set of valid concept keys.

###### Definition 1 (Structural Hallucination).

Given a context $c$ and a schema $\mathcal{S}$, a generated memory key $\hat{k}$ is a Structural Hallucination if and only if:

$$\hat{k} \notin \mathcal{S} \quad (1)$$

even if $\hat{k}$ is semantically plausible or factually correct in a broad world knowledge context.

Remark. Structural hallucination differs fundamentally from retrieval noise: retrieval systems may return irrelevant keys, but always valid ones ($\hat{k} \in \mathcal{S}$), whereas generative models can produce non-existent keys that cause lookup failures.

Our objective is to construct a memory access mechanism $P_{\theta}$ such that the probability of structural hallucination is strictly zero:

$$P_{\theta}(\hat{k} \notin \mathcal{S} \mid c) = 0 \quad (2)$$
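
To make the distinction concrete, the following toy snippet contrasts the two failure modes under an assumed three-key schema; the keys and strings are purely illustrative:

```python
# Toy illustration of Definition 1 with an assumed schema S.
schema = {"marathon training", "guitar lessons", "trip to Kyoto"}

retrieved = "trip to Kyoto"        # retrieval: possibly irrelevant, but always in S
generated = "Kyoto vacation plan"  # unconstrained generation: a plausible paraphrase

assert retrieved in schema         # retrieval noise, never a lookup failure
assert generated not in schema     # structural hallucination (Eq. 1): lookup fails
```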

### 3.2 The Cognitive Schema

To enforce the validity constraint established in Eq. (2), we explicitly model the agent's epistemic boundaries. We operationalize the cognitive schema $\mathcal{S}$ not as a static database, but as a dynamic structural constraint implemented via a Prefix Trie $\mathcal{T}$.

###### Definition 2 (Cognitive Schema).

The Cognitive Schema $\mathcal{S}$ is formally defined as a finite set of concept keys over the token vocabulary $\Sigma$:

$$\mathcal{S} \subset \Sigma^{*} \quad (3)$$

where $\Sigma^{*}$ denotes the set of all finite token sequences. To make this set computationally enforceable during generation, we construct a Prefix Trie $\mathcal{T}$ such that every path from the root to a marked end-node corresponds to a valid key $k \in \mathcal{S}$.

Mathematically, $\mathcal{T}$ defines the prefix-closed validity space $\Omega_{\mathcal{S}}$ within the universe of all token sequences $\Sigma^{*}$:

$$\Omega_{\mathcal{S}} = \{ s \in \Sigma^{*} \mid \exists k \in \mathcal{S},\ s \text{ is a prefix of } k \} \quad (4)$$

A generated memory entry $\hat{k}$ satisfies the non-hallucination condition (Eq. 2) if and only if $\hat{k} \in \mathcal{S} \subset \Omega_{\mathcal{S}}$.

Crucially, the schema functions as a hard constraint on the language model's decoding manifold. We define a binary validity indicator $\mathbb{I}_{\mathcal{S}}(y_{1:t})$ for any generated token sequence $y_{1:t}$ at step $t$:

$$\mathbb{I}_{\mathcal{S}}(y_{1:t}) = \begin{cases} 1 & \text{if } y_{1:t} \in \Omega_{\mathcal{S}} \\ 0 & \text{otherwise} \end{cases} \quad (5)$$

This indicator acts as a gatekeeper, pruning the probability mass of invalid tokens to zero before the softmax layer (detailed in Sec. [3.3](https://arxiv.org/html/2604.20117#S3.SS3)).

By adopting this architecture, we achieve a fundamental dissociation between existence and association:

- Existence (Ontology): managed by the Trie $\mathcal{T}$. It answers the question "Does this concept exist in my world?" by strictly enforcing $\hat{k} \in \mathcal{S}$.
- Association (Topology): managed by the Associative Graph. It answers "How is this concept related to others?" via edge weights (Sec. [3.5](https://arxiv.org/html/2604.20117#S3.SS5)).

This separation guarantees that the agent's reasoning process is strictly confined to its valid epistemic boundaries, thereby eliminating structural hallucinations by construction.
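
A minimal sketch of the schema as a prefix trie is shown below; it uses string tokens for readability, whereas the actual system would operate over the LLM tokenizer's integer ids:

```python
# Minimal sketch of the Cognitive Schema as a prefix trie (Definition 2).
class SchemaTrie:
    END = object()  # marks a complete, valid key k in S

    def __init__(self):
        self.root = {}

    def insert(self, tokens):
        node = self.root
        for t in tokens:
            node = node.setdefault(t, {})
        node[SchemaTrie.END] = True

    def valid_prefix(self, tokens):
        # Eq. (4): membership in the prefix-closed space Omega_S.
        node = self.root
        for t in tokens:
            if t not in node:
                return False
            node = node[t]
        return True

    def is_key(self, tokens):
        # Membership in S itself: a complete root-to-end-node path.
        node = self.root
        for t in tokens:
            if t not in node:
                return False
            node = node[t]
        return SchemaTrie.END in node

trie = SchemaTrie()
trie.insert(["guitar", "lessons"])
assert trie.valid_prefix(["guitar"])       # partial path: in Omega_S
assert trie.is_key(["guitar", "lessons"])  # complete path: in S
assert not trie.valid_prefix(["piano"])    # would be pruned at decode time
```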

### 3.3 Schema-Constrained Decoding

Memory access in SCG-Mem is realized through schema-constrained decoding. Given a context $c$, we modify the token-level generation process of the LLM (parameterized by $\theta$) such that the probability mass is strictly confined to the validity space $\Omega_{\mathcal{S}}$.

Formally, let $y_{<t}$ be the generated prefix at step $t$. The schema-constrained distribution $P_{\mathcal{S}}$ is defined as the renormalization of the base model distribution $P_{\theta}$ masked by the validity indicator $\mathbb{I}_{\mathcal{S}}$ (defined in Eq. 5):

$$P_{\mathcal{S}}(y_{t} \mid y_{<t}, c) = \frac{P_{\theta}(y_{t} \mid y_{<t}, c) \cdot \mathbb{I}_{\mathcal{S}}(y_{<t} \circ y_{t})}{Z(y_{<t})} \quad (6)$$

where $\circ$ denotes string concatenation and $Z(y_{<t})$ is the normalization constant. This mechanism enforces an *ex-ante epistemic constraint*: during autoregressive decoding, any token $y_{t}$ that results in a prefix outside $\Omega_{\mathcal{S}}$ receives zero probability. Consequently, the agent is mathematically incapable of generating a completed key $\hat{k}$ such that $\hat{k} \notin \mathcal{S}$.
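
In practice, this masking can be implemented with off-the-shelf constrained generation hooks. The sketch below uses the `prefix_allowed_tokens_fn` argument of Hugging Face `generate`, which restricts each decoding step to a caller-supplied token set; the model name, prompt, and toy schema are illustrative assumptions, not the paper's exact setup:

```python
# Sketch: schema-constrained decoding via trie masking (Eq. 6).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen2.5-1.5B-Instruct"  # assumption: any open-weights causal LM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Token-level prefix trie over an assumed toy schema S.
schema = ["marathon training", "guitar lessons", "trip to Kyoto"]
trie = {}
for key in schema:
    node = trie
    for tid in tok.encode(key, add_special_tokens=False) + [tok.eos_token_id]:
        node = node.setdefault(tid, {})

inputs = tok("Query: What hobby did Alice pick up?\nMemory key:",
             return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

def allowed_tokens(batch_id, input_ids):
    # Walk the trie with the freshly generated suffix only; any token not on
    # a valid path is excluded, i.e., its probability mass is pruned to zero.
    node = trie
    for tid in input_ids[prompt_len:].tolist():
        node = node.get(tid, {})
    return list(node.keys()) or [tok.eos_token_id]

out = model.generate(**inputs, max_new_tokens=16,
                     prefix_allowed_tokens_fn=allowed_tokens)
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))  # always in S
```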

### 3.4 Schema Evolution: Assimilation and Accommodation

Because the schema $\mathcal{S}$ strictly constrains generation, it must distinguish between "known concepts" and "novel information." We model this via Evolutionary Schema Construction, a dual-pathway process inspired by Piagetian dynamics. Let $\mathcal{S}^{(t-1)}$ denote the concept set at time $t-1$. When a new interaction $d^{(t)}$ arrives:

#### Assimilation (Grounding).

The agent first attempts to interpret the input using its existing cognitive structures. We perform schema-constrained generation to map the input to valid keys within the current ontology:

$$\mathcal{K}_{\text{assim}} = \{ k \in \mathcal{S}^{(t-1)} \mid k \sim P_{\mathcal{S}}(\cdot \mid d^{(t)}) \} \quad (7)$$

Assimilation reinforces the weights of existing graph edges without altering the epistemic boundary $\mathcal{S}$.

#### Accommodation (Expansion).

If constrained generation fails to adequately represent the input (e.g., as measured by high perplexity or a dedicated "unknown" token), the system triggers accommodation. The schema constraint is temporarily relaxed, allowing the base model $P_{\theta}$ to generate novel concepts:

$$\mathcal{K}_{\text{nov}} = \{ k \mid k \sim P_{\theta}(\cdot \mid d^{(t)}) \} \setminus \mathcal{S}^{(t-1)} \quad (8)$$

These novel concepts are then validated and inserted into the Trie, explicitly expanding the agent's epistemic boundary:

$$\mathcal{S}^{(t)} \leftarrow \mathcal{S}^{(t-1)} \cup \mathcal{K}_{\text{nov}} \quad (9)$$

This evolutionary mechanism ensures that the memory system remains open-ended while maintaining structural consistency.
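
The following sketch wires the two pathways together; the extraction helpers are hypothetical stand-ins for the constrained and unconstrained LLM calls described above, and the coverage test is a crude proxy for the perplexity or unknown-token trigger:

```python
# Sketch of Evolutionary Schema Construction (Eqs. 7-9); helpers are hypothetical.
def constrained_extract(turn: str, schema: set) -> set:
    # Stand-in for schema-constrained generation: can only return keys in S.
    return {k for k in schema if k in turn}

def free_extract(turn: str) -> set:
    # Stand-in for unconstrained concept extraction by the base model P_theta.
    return set(turn.lower().strip(".").split())

def update_schema(schema: set, turn: str) -> set:
    assimilated = constrained_extract(turn, schema)  # Eq. (7): S unchanged
    if not assimilated:                              # proxy for "inadequately represented"
        novel = free_extract(turn) - schema          # Eq. (8): relax the constraint
        schema = schema | novel                      # Eq. (9): expand the boundary
    return schema

schema = {"guitar lessons"}
schema = update_schema(schema, "I started pottery classes")
print(schema)  # accommodation inserted the novel concepts into the schema
```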

### 3.5 Associative Reasoning within the Schema

While the Cognitive Schema $\mathcal{S}$ enforces existence (validity), it implies a flat topology. To support multi-hop reasoning, we overlay an Associative Graph modeled as a weighted undirected graph $\mathcal{G} = (V, E)$, where the vertex set $V = \mathcal{S}$ corresponds exactly to the valid concepts defined in the schema.

We define the edge weight $w_{uv}$ between two concepts $u, v \in \mathcal{S}$ based on their semantic coupling strength across the interaction history $\mathcal{H}$. By abstracting away linear token distance, we focus purely on the information-theoretic value of their co-occurrence. The weight is computed as the accumulated IDF product:

$$w_{uv} = \sum_{(u,v) \in \mathcal{H}} \text{IDF}(u) \cdot \text{IDF}(v) \quad (10)$$

where the Inverse Document Frequency is defined as $\text{IDF}(k) = \log \frac{N}{df(k)}$, with $N$ denoting the total number of dialogue turns and $df(k)$ the number of turns containing concept $k$. Here $\mathcal{H}$ represents the set of co-occurring concept pairs within the same dialogue turn. The $\text{IDF}(\cdot)$ term acts as a significance filter, penalizing ubiquitous stop-words while boosting the connection strength between rare, domain-specific concepts. This topology enables the propagation of activation from explicit query terms to implicitly related concepts based on accumulated semantic relevance.
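
A short sketch of Eq. (10) over a toy history follows; the turn-level concept sets are assumptions for illustration:

```python
# Sketch of Eq. (10): accumulated-IDF edge weights from turn-level co-occurrence.
import math
from collections import Counter, defaultdict
from itertools import combinations

turns = [{"kyoto", "trip"}, {"kyoto", "guitar"}, {"guitar", "lessons"}]  # toy H
N = len(turns)
df = Counter(c for t in turns for c in t)        # df(k): turns containing k
idf = {k: math.log(N / df[k]) for k in df}       # IDF(k) = log(N / df(k))

w = defaultdict(float)
for t in turns:
    for u, v in combinations(sorted(t), 2):      # co-occurring pairs in a turn
        w[(u, v)] += idf[u] * idf[v]             # accumulate the IDF product

print(dict(w))  # rare concept pairs receive the strongest edges
```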

### 3.6 Constructive Recall

Recall in SCG-Mem is a constructive, dual-stage process comprising schema grounding and activation propagation.

#### Schema Activation.

Given a query $q$, the agent first identifies valid entry points within the schema $\mathcal{S}$ for subsequent graph traversal. Instead of relying on fuzzy vector retrieval, we perform schema-constrained generation to obtain a set of seed concepts $K_{\text{seed}}$. To decode multiple diverse yet relevant keywords, we employ constrained beam search with beam size $b$. At each decoding step, we maintain the top-$b$ partial sequences ranked by cumulative log-probability, while enforcing the schema constraint (Eq. 5) to prune invalid branches. Upon completion, this yields $b$ distinct valid keys:

$$K_{\text{seed}} = \text{BeamSearch}_{b}(P_{\mathcal{S}}(\cdot \mid q)), \quad |K_{\text{seed}}| = b \quad (11)$$

This approach leverages the LLM's semantic understanding to identify the most query-relevant concepts, while the Trie constraint guarantees that all returned keys are valid schema nodes.
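
A minimal sketch of this step, reusing `model`, `tok`, `inputs`, `prompt_len`, and `allowed_tokens` from the Section 3.3 sketch; beam size $b=2$ is an illustrative assumption:

```python
# Sketch of Eq. (11): b distinct valid seed keys via constrained beam search.
out = model.generate(**inputs, max_new_tokens=16,
                     num_beams=2, num_return_sequences=2,
                     prefix_allowed_tokens_fn=allowed_tokens)
k_seed = {tok.decode(seq[prompt_len:], skip_special_tokens=True) for seq in out}
print(k_seed)  # every beam is pruned by the trie, so all seeds lie in S
```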

#### Associative Propagation.

To recover implicit context, we propagate activation from the seed concepts to their neighbors via weighted sampling. For each active node $u \in K_{\text{seed}}$, we compute the transition probability to a neighbor $v$ by applying a softmax over the edge weights $w_{uv}$ (defined in Sec. [3.5](https://arxiv.org/html/2604.20117#S3.SS5)):

$$P(v \mid u) = \text{Softmax}(w_{uv}) = \frac{\exp(w_{uv}/T)}{\sum_{v' \in \mathcal{N}(u)} \exp(w_{uv'}/T)} \quad (12)$$

where $\mathcal{N}(u)$ denotes the neighbors of $u$, and $T$ is a temperature parameter controlling the breadth of association. The set of retrieved context concepts $K_{\text{context}}$ is then obtained by sampling from this distribution:

$$K_{\text{context}} \sim P(\cdot \mid K_{\text{seed}}) \quad (13)$$

This stochastic approach allows the agent to efficiently explore the neighborhood of valid concepts, capturing relevant details that are topologically close (high semantic coupling) to the query.
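
The sketch below implements the temperature softmax and sampling over an assumed toy weight map; the values are illustrative, not computed from data:

```python
# Sketch of Eqs. (12)-(13): softmax transition over edge weights, then sampling.
import math
import random

w = {("kyoto", "trip"): 0.8, ("kyoto", "guitar"): 0.5, ("guitar", "lessons"): 1.2}

def neighbors(u):
    # Undirected graph: collect (neighbor, weight) pairs for edges touching u.
    return [(b if a == u else a, wt) for (a, b), wt in w.items() if u in (a, b)]

def propagate(k_seed, T=0.5, samples_per_seed=2):
    k_context = set()
    for u in k_seed:
        nbrs = neighbors(u)
        if not nbrs:
            continue
        z = sum(math.exp(wt / T) for _, wt in nbrs)
        probs = [math.exp(wt / T) / z for _, wt in nbrs]        # Eq. (12)
        k_context.update(random.choices([v for v, _ in nbrs],
                                        weights=probs,
                                        k=samples_per_seed))    # Eq. (13)
    return k_context - set(k_seed)

print(propagate({"kyoto"}))  # e.g., {'trip', 'guitar'}
```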

#### Context Reconstruction.

To ground the agent's reasoning in concrete evidence, we map the activated conceptual subgraph back to the original source text. We retrieve the text segments associated with the union of seed and expanded concepts ($K_{\text{seed}} \cup K_{\text{context}}$) to form the context $C$. The final response is synthesized by the LLM, conditioned on this schema-grounded context: $r \sim P_{\theta}(r \mid C, q)$.
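
A minimal sketch of this mapping; `segments_for` is a hypothetical index from concept keys to the dialogue segments in which they occur:

```python
# Sketch of context reconstruction: activated concepts -> source segments -> C.
def build_context(k_seed: set, k_context: set, segments_for) -> str:
    seen, ordered = set(), []
    for concept in sorted(k_seed | k_context):   # K_seed union K_context
        for seg in segments_for(concept):        # map each concept back to raw text
            if seg not in seen:                  # deduplicate shared segments
                seen.add(seg)
                ordered.append(seg)
    return "\n".join(ordered)                    # context C for r ~ P_theta(r | C, q)
```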

## 4 Experiments

To rigorously evaluate SCG-Mem, we follow the established protocols of the LoCoMo benchmark (Maharana et al. [2024](https://arxiv.org/html/2604.20117#bib.bib7)), which represents the current state of the art for evaluating long-term, multi-session agentic memory.

### 4.1 Dataset

We utilize the LoCoMo dataset (Maharana et al. [2024](https://arxiv.org/html/2604.20117#bib.bib7)), featuring ultra-long dialogues (avg. 9K tokens, up to 35 sessions). We evaluate on four categories: Single-Hop (single-session retrieval), Multi-Hop (cross-session synthesis), Temporal (time-dependent updates), and Adversarial (misleading queries), excluding Open-Domain to focus on conversation-grounded memory.

### 4.2 Metrics

For evaluation, we employ two primary metrics: the F1 Score to assess answer accuracy by balancing precision and recall, and BLEU-1 to evaluate generated response quality by measuring word overlap with ground-truth responses.
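
For reference, a common token-overlap formulation of the F1 score is sketched below; the whitespace tokenization and lowercase normalization are assumptions, as LoCoMo's exact scoring script may differ:

```python
# Sketch of token-level F1 between a predicted and a gold answer.
from collections import Counter

def token_f1(pred: str, gold: str) -> float:
    p, g = pred.lower().split(), gold.lower().split()
    overlap = sum((Counter(p) & Counter(g)).values())  # shared tokens (with counts)
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(p), overlap / len(g)
    return 2 * precision * recall / (precision + recall)

print(round(token_f1("a trip to Kyoto", "trip to Kyoto in April"), 3))  # 0.667
```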

### 4.3 Baselines

We compare SCG-Mem against five representative memory systems:

- LoCoMo (Maharana et al. [2024](https://arxiv.org/html/2604.20117#bib.bib7)) directly leverages foundation models without memory mechanisms, incorporating the complete preceding conversation into the prompt for each query.
- ReadAgent (Lee et al. [2024](https://arxiv.org/html/2604.20117#bib.bib11)) processes long-context documents through episode pagination, memory gisting, and interactive look-up.
- MemoryBank (Zhong et al. [2024](https://arxiv.org/html/2604.20117#bib.bib12)) maintains historical interactions with a dynamic memory-updating mechanism based on the Ebbinghaus forgetting curve.
- MemGPT (Packer et al. [2023](https://arxiv.org/html/2604.20117#bib.bib13)) implements a dual-tier virtual context management system inspired by OS memory hierarchies, with main context (RAM) and external context (disk).
- A-MEM (Xu et al. [2025b](https://arxiv.org/html/2604.20117#bib.bib14)) reformulates raw interactions into structured "notes" and employs an evolutionary mechanism to establish inter-memory connections, using dense retrieval for initial node access and associative traversal for context gathering.

### 4.4 Implementation Details

Since SCG-Mem requires direct access to token-level probability distributions for schema-constrained decoding, we deploy all models locally using the Hugging Face Transformers library. We primarily evaluate on Qwen2.5 (1.5B, 3B) (Hui et al. [2024](https://arxiv.org/html/2604.20117#bib.bib31)) and Llama 3.2 (1B, 3B) (Grattafiori et al. [2024](https://arxiv.org/html/2604.20117#bib.bib30)), instantiated at full precision for accurate logit manipulation. For text embedding, we utilize the BGE-M3 model (Chen et al. [2024](https://arxiv.org/html/2604.20117#bib.bib29)) across all experiments to construct the initial concept representations.

Table 1: Experimental results on the LoCoMo benchmark across different foundation models. Since SCG-Mem requires access to token-level probability distributions for schema-constrained decoding, we evaluate on open-weights models (Qwen 2.5 and Llama 3.2). SCG-Mem consistently outperforms the strongest baseline A-MEM, demonstrating the efficacy of structural constraints even on smaller-scale models. (Best results in bold.)

### 4.5 Main Results

Table [1](https://arxiv.org/html/2604.20117#S4.T1) shows that SCG-Mem consistently outperforms the strongest baseline A-MEM across all categories on both Qwen 2.5 and Llama 3.2 models, with notable gains exceeding 100% in specific tasks (e.g., +146.7% F1 in Single-Hop on Qwen2.5 3B). Specifically, Multi-Hop reasoning sees up to +126.6% F1 improvement on Qwen2.5 3B, validating that our hybrid architecture, combining Trie-based constrained decoding with Associative Graph traversal, successfully bridges disjoint contexts that challenge standard retrieval. Concurrently, Temporal tasks show substantial gains (e.g., +111.5% F1 on Llama 3.2 1B), confirming that the assimilation-accommodation mechanism effectively updates epistemic boundaries to track evolving facts. Finally, in Adversarial settings, which are deliberately designed to mislead with traps and uncertainty, SCG-Mem demonstrates that such queries can be accurately mapped to relevant memories through semantic mapping, boosting performance by up to +88.6% F1 on Qwen2.5 3B.

### 4.6 Ablation Study

To analyze the contribution of each component, we conduct an ablation study using Qwen 2.5 3B as the foundation model across all four task categories (Table [2](https://arxiv.org/html/2604.20117#S4.T2)).

#### w/o Cognitive Constraint.

We replace schema-constrained decoding with unconstrained generation, then concatenate the generated keywords into a single string and use its embedding to retrieve seed concepts from the Cognitive Schema via dense similarity matching. Results show severe degradation, notably Multi-Hop F1 dropping by 39.5%. This confirms that accurate seed concept selection is critical for effective memory recall.

#### w/o Evolutionary Update.

We disable assimilation (grounding to existing schema nodes) and retain only accommodation (generating novel concepts). This inability to connect new information with existing knowledge causes substantial drops: Multi-Hop F1 decreases by 34.2% and Temporal by 20.1%. This highlights that linking new inputs to existing epistemic structures is essential for bridging disjoint contexts and tracking evolving facts.

Table 2: Ablation study on key components across all four task categories. We evaluate the impact of removing the schema constraint and the evolutionary mechanism. The results confirm that each component is essential for its specific target capability (e.g., the Schema for Adversarial, Evolution for Temporal).

### 4.7 Hyperparameter Sensitivity

We examine the impact of two critical hyperparameters: the number of retrieved concepts $k$ for context reconstruction, and the depth of associative propagation (hops) on the graph topology.

![(a) Multi-Hop](https://arxiv.org/html/2604.20117v1/x3.png)
![(b) Temporal](https://arxiv.org/html/2604.20117v1/x4.png)
![(c) Single-Hop](https://arxiv.org/html/2604.20117v1/x5.png)
![(d) Adversarial](https://arxiv.org/html/2604.20117v1/x6.png)

Figure 3: Impact of Retrieval Size $k$. Performance exhibits a consistent rise-then-fall pattern across all categories, peaking around $k=35$. Insufficient retrieval ($k<20$) misses relevant context, while excessive retrieval ($k>40$) introduces noise that degrades reasoning precision.

#### Impact of Retrieval Size ($k$).

As illustrated in Figure [3](https://arxiv.org/html/2604.20117#S4.F3), performance follows a characteristic inverted-U curve across all task categories, peaking around $k=35$. In the ascending phase ($k<35$), increasing retrieval size progressively enriches the context with relevant concepts, enabling more comprehensive reasoning. Beyond the optimal point, performance gradually declines as excessive retrieval introduces semantically related but contextually irrelevant information, which dilutes the signal and impairs the LLM's reasoning precision. Notably, complex tasks (Multi-Hop and Temporal) exhibit steeper gains during the ascending phase, reflecting their greater dependence on sufficient contextual coverage.

#### Impact of Association Hops.

Figure [4](https://arxiv.org/html/2604.20117#S4.F4) reveals a consistent inverted-V pattern across all task categories. Hop-0 retrieval, which relies solely on directly matched seed concepts via the Schema, yields the lowest performance, particularly on Multi-Hop tasks where information is scattered across disjoint sessions. Extending to hop-1 substantially improves all categories by recovering implicitly connected concepts through one-step graph traversal. However, hop-2 propagation universally degrades performance, with Temporal reasoning showing the steepest decline. This suggests that while shallow associative propagation effectively bridges semantic gaps, deeper traversal introduces semantically drifted concepts that dilute the context and impair reasoning precision. The optimal hop depth of 1 reflects a balance between coverage and noise.

![(a) Multi-Hop](https://arxiv.org/html/2604.20117v1/x7.png)
![(b) Temporal](https://arxiv.org/html/2604.20117v1/x8.png)
![(c) Single-Hop](https://arxiv.org/html/2604.20117v1/x9.png)
![(d) Adversarial](https://arxiv.org/html/2604.20117v1/x10.png)

Figure 4: Impact of Hop Count across Categories. Performance consistently peaks at hop-1 across all categories, demonstrating the critical value of one-step associative propagation. Multi-Hop exhibits the largest relative gain from hop-0 to hop-1, as direct schema matching alone cannot bridge disjoint conversation sessions. However, hop-2 uniformly degrades performance, with Temporal reasoning suffering the most severe decline, indicating that excessive propagation introduces semantically drifted noise.

## 5 Conclusion and Future Work

In this paper, we proposed SCG-Mem, a novel memory architecture that reformulates memory access from discriminative retrieval to schema-constrained generation. By maintaining a dynamic Prefix Trie as the cognitive schema and constraining LLM decoding to generate only valid memory entry keys, we provide a formal guarantee that eliminates structural hallucinations by construction. To support long-term adaptation, we model memory updates via Piagetian assimilation (grounding into existing schema) and accommodation (schema expansion with novel concepts). Furthermore, we construct an Associative Graph over the schema and perform activation propagation for multi-hop reasoning. Experimental results on the LoCoMo benchmark demonstrate that SCG-Mem consistently outperforms state-of-the-art baselines, particularly in multi-hop and adversarial tasks. Future work will explore memory compression mechanisms during storage to reduce redundancy and improve efficiency, as well as memory rewriting strategies that enable the agent to consolidate and refine existing knowledge over time. We also plan to investigate hierarchical schema structures that organize concepts at multiple levels of abstraction, potentially improving both retrieval efficiency and semantic coherence. Additionally, extending SCG-Mem to multi-modal settings, where the cognitive schema encompasses visual, auditory, and textual concepts, represents a promising direction for building more general-purpose memory systems.

## References

- J. Chen, S. Xiao, P. Zhang, K. Luo, D. Lian, and Z. Liu (2024). BGE M3-Embedding: multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation. arXiv preprint arXiv:2402.03216.
- D. Edge, H. Trinh, N. Cheng, J. Bradley, A. Chao, A. Mody, S. Truitt, D. Metropolitansky, R. O. Ness, and J. Larson (2024). From local to global: a Graph RAG approach to query-focused summarization. arXiv preprint arXiv:2404.16130.
- A. Grattafiori, A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Vaughan, et al. (2024). The Llama 3 herd of models. arXiv preprint arXiv:2407.21783.
- C. Hokamp and Q. Liu (2017). Lexically constrained decoding for sequence generation using grid beam search. arXiv preprint arXiv:1704.07138.
- L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, W. Peng, X. Feng, B. Qin, et al. (2025). A survey on hallucination in large language models: principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems 43(2), pp. 1–55.
- B. Hui, J. Yang, Z. Cui, J. Yang, D. Liu, L. Zhang, T. Liu, J. Zhang, B. Yu, K. Lu, et al. (2024). Qwen2.5-Coder technical report. arXiv preprint arXiv:2409.12186.
- K. Lee, X. Chen, H. Furuta, J. Canny, and I. Fischer (2024). A human-inspired reading agent with gist memory of very long contexts. arXiv preprint arXiv:2402.09727.
- K. Li, T. Zhang, X. Wu, H. Luo, J. Glass, and H. Meng (2025a). Decoding on graphs: faithful and sound reasoning on knowledge graphs through generation of well-formed chains. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 24349–24364.
- R. Li, Z. Zhang, X. Bo, Z. Tian, X. Chen, Q. Dai, Z. Dong, and R. Tang (2025b). CAM: a constructivist view of agentic memory for LLM-based reading comprehension. arXiv preprint arXiv:2510.05520.
- X. Li, J. Jin, Y. Zhou, Y. Wu, Z. Li, Y. Qi, and Z. Dou (2025c). RetroLLM: empowering large language models to retrieve fine-grained evidence within generation. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 16754–16779.
- L. Luo, Z. Zhao, G. Haffari, Y. Li, C. Gong, and S. Pan (2024). Graph-constrained reasoning: faithful reasoning on knowledge graphs with large language models. arXiv preprint arXiv:2410.13080.
- A. Maharana, D. Lee, S. Tulyakov, M. Bansal, F. Barbieri, and Y. Fang (2024). Evaluating very long-term conversational memory of LLM agents. arXiv preprint arXiv:2402.17753.
- C. Packer, V. Fang, S. G. Patil, K. Lin, S. Wooders, and J. E. Gonzalez (2023). MemGPT: towards LLMs as operating systems.
- J. Piaget (1952). The Origins of Intelligence in Children. International Universities Press.
- J. Piaget (1970). Genetic Epistemology. Columbia University Press, New York. Note: the core philosophical basis, "To know is to construct".
- G. Poesia, O. Polozov, V. Le, A. Tiwari, G. Soares, C. Meek, and S. Gulwani (2022). Synchromesh: reliable code generation from pre-trained language models. arXiv preprint arXiv:2201.11227.
- M. Post and D. Vilar (2018). Fast lexically constrained decoding with dynamic beam allocation for neural machine translation. arXiv preprint arXiv:1804.06609.
- A. Rezazadeh, Z. Li, W. Wei, and Y. Bao (2024). From isolated conversations to hierarchical schemas: dynamic tree memory representation for LLMs. arXiv preprint arXiv:2410.14052.
- P. Sarthi, S. Abdullah, A. Tuli, S. Khanna, A. Goldie, and C. D. Manning (2024). RAPTOR: recursive abstractive processing for tree-organized retrieval. In The Twelfth International Conference on Learning Representations.
- T. Scholak, N. Schucher, and D. Bahdanau (2021). PICARD: parsing incrementally for constrained auto-regressive decoding from language models. arXiv preprint arXiv:2109.05093.
- L. Xu, Z. Su, M. Yu, J. Li, F. Meng, and J. Zhou (2025a). Dense retrievers can fail on simple queries: revealing the granularity dilemma of embeddings. arXiv preprint arXiv:2506.08592.
- W. Xu, Z. Liang, K. Mei, H. Gao, J. Tan, and Y. Zhang (2025b). A-MEM: agentic memory for LLM agents. arXiv preprint arXiv:2502.12110.
- W. Zhong, L. Guo, Q. Gao, H. Ye, and Y. Wang (2024). MemoryBank: enhancing large language models with long-term memory. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38, pp. 19724–19731.

Similar Articles

HeLa-Mem: Hebbian Learning and Associative Memory for LLM Agents

arXiv cs.CL

HeLa-Mem is a bio-inspired memory architecture for LLM agents that models memory as a dynamic graph using Hebbian learning dynamics, featuring episodic and semantic memory stores to improve long-term coherence. Experiments on LoCoMo show superior performance across question categories while using fewer context tokens.

Belief Memory: Agent Memory Under Partial Observability

arXiv cs.AI

This paper introduces BeliefMem, a novel memory paradigm for LLM agents that stores multiple candidate conclusions with probabilities to handle partial observability and reduce self-reinforcing errors. Empirical evaluations show it outperforms deterministic baselines on LoCoMo and ALFWorld benchmarks.

rohitg00/agentmemory

GitHub Trending (daily)

agentmemory is an open-source persistent memory layer for AI coding agents (Claude Code, Cursor, Gemini CLI, Codex CLI, etc.) that uses knowledge graphs, confidence scoring, and hybrid search to give agents long-term memory across sessions via MCP, hooks, or REST API. Built on the iii engine, it requires no external databases and exposes 51 MCP tools.