Tag
Researchers from Alibaba/Qwen and Peking University introduce TMEM, a self-evolving parametric memory framework that uses online LoRA weight updates to let LLM agents genuinely learn from experience within a single episode, rather than relying solely on prompt-space memory. TMEM outperforms summary-based and retrieval-based baselines across multiple benchmarks including LoCoMo, LongMemEval-S, and CL-Bench.
This paper investigates the quantitative limits of parametric memory in LLMs using LoRA as a probe, establishing a power law relationship and introducing a threshold-guided optimization method called MemFT for improved memory performance.