@tli104: New paper: "Self-Compacting Language Model Agents" LM agents build up long traces of reasoning and tool calls. As the t…
Summary
A new paper proposes self-compacting language model agents, which decide for themselves when to clean up their traces of reasoning and tool calls so that old mistakes and stale information do not accumulate.
Cached at: 06/24/26, 02:25 PM
🚨 New paper: “Self-Compacting Language Model Agents”
LM agents build up long traces of reasoning and tool calls. As the trace grows, old mistakes and stale info stick around and anchor everything that follows. We ask: can the model itself decide when to clean up?
Similar Articles
Self-Compacting Language Model Agents
SelfCompact is a scaffolding approach that lets language models autonomously decide when and how to compact long agent traces, achieving better performance at lower token cost than fixed-interval compaction methods.
@omarsar0: Language models need "sleep"
A paper explores letting language model agents 'sleep' to reset internal state and improve performance on long-horizon tasks, addressing context length scaling issues.
@dair_ai: Great paper on long-term memory for LLM agents. (bookmark it) Coarse summaries drift and unconstrained updates corrupt,…
AtomMem introduces a long-term memory system for LLM agents that uses atomic facts as efficient memory units, organizing them into hierarchical event structures and temporal user profiles, and achieves state-of-the-art results on the LoCoMo benchmark.
Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories
This paper introduces a 'Sleep' paradigm for large language models that enables continual learning through memory consolidation and dreaming phases, allowing models to distill short-term knowledge into long-term parameters and self-improve without human supervision.
PACE: Two-Timescale Self-Evolution for Small Language Model Agents
PACE introduces a two-timescale framework for self-evolution of small language model agents, coordinating low-risk prompt refinement with higher-risk control-logic updates, achieving up to +9.2% relative improvement across benchmarks.