NVIDIA's ASPIRE framework enables robots to build a persistent library of skills from successful experiences, allowing reuse for new tasks and improving learning efficiency over time.
This work introduces a cognitive architecture that learns without backpropagation, GPUs, or catastrophic forgetting, with the aim of mimicking biological learning.
MLUBench is a large-scale benchmark for lifelong unlearning in multimodal large language models (MLLMs), featuring 127 entities across 9 classes. The paper finds that existing unlearning methods suffer from cumulative degradation and proposes LUMoE to mitigate it, reporting significant improvements over those methods.
This paper proposes RidgeFT, a lightweight analytic update framework for lifelong machine-generated text attribution that adapts to new text generators without forgetting old ones, achieving strong performance across multiple evaluation settings.
SOLAR proposes a self-optimizing autonomous agent that leverages parameter-level meta-learning and multi-level reinforcement learning to enable lifelong adaptation of LLMs to non-stationary data streams, outperforming baselines on reasoning tasks.
LiSA (Lifelong Safety Adaptation) is a framework that strengthens AI agent safety guardrails by converting occasional failures into reusable policy abstractions. It uses evidence-aware confidence gating to perform well under sparse and noisy feedback, addressing the need for adaptive safety in real-world deployments.
The SEAL preprint outlines how future language models could update themselves after deployment, and hints that GPT-6 might exhibit a form of "computational life" through evolving internal states.
CobwebTM is a low-parameter lifelong hierarchical topic modeling approach that adapts the Cobweb algorithm to continuous document embeddings, enabling unsupervised topic discovery and dynamic hierarchical organization without predefining topic counts. The method combines incremental symbolic concept formation with pretrained representations to achieve strong topic coherence while avoiding catastrophic forgetting.
SkillFlow introduces a benchmark of 166 tasks across 20 families for evaluating autonomous agents' ability to discover, repair, and maintain skills over time through a lifelong learning protocol. Experiments reveal a substantial capability gap among leading models, with Claude Opus 4.6 improving significantly while others show limited or negative gains from skill evolution.