Tag
Describes improving agentic memory search by combining grep-based exact matching with vector embeddings, an approach inspired by a paper; the authors report significant recall gains in their memory layer.
This paper introduces PersonaDrive, a pipeline that conditions a vision-language-action (VLA) driving agent on retrieved demonstrations from a style-instructed human driving dataset, enabling style-diverse non-ego agents for closed-loop simulation and improving driving scores on Bench2Drive.
This paper introduces Engram, an open-source bi-temporal memory engine for LLM agents whose hybrid read path fuses dense, lexical, graph, and temporal signals to retrieve a compact context slice (~9.6k tokens). On LongMemEval, it outperforms the full-history baseline (79k tokens) by 10.4 accuracy points.
A community discussion on agent memory reveals that while various patches exist for what to write down (e.g., plain files, layered memory, post-mortems), the unsolved problem is what to keep: detecting failures is tractable, but deciding which lessons persist still needs human judgment.
This paper introduces a four-condition diagnostic protocol to separate no-evidence answerability, oracle-evidence recoverability, full-context utilization, and retrieval-conditioned utilization in long-context and retrieval-augmented language models, tested on five open-weight models across multiple datasets.
QueryAgent-R1 is an agentic framework that bridges query generation and product retrieval in e-commerce using reinforcement learning and memory abstraction, improving query CTR by 2.9% and CVR by 3.1% in online tests.
This paper proposes SGDR (State-Grounded Dynamic Retrieval), an online skill learning method for web agents that enables stepwise, state-aware skill reuse rather than static task-level retrieval. Experiments on WebArena show SGDR achieves a 37.5% success rate with GPT-4.1, a ~10.6% relative gain over strong baselines.
Mnemo is an open-source, local-first memory layer for any LLM that extracts entities and relationships into a persistent knowledge graph using SQLite and petgraph, providing automatic context injection for enhanced conversations.
A curated list of top models, engines, libraries, and datasets for late-interaction multivector retrieval, organized in an 'Awesome Multivector Retrieval' resource.
Proposes SENSE, a semantic embedding navigation method for retrieval-based speculative decoding that uses hidden states for semantic alignment and soft-gated evaluation, achieving up to 3.26x speedup on LLaMA and Qwen families while preserving generation quality.
ExpGraph is a model-agnostic framework that enables LLM agents to reuse past experiences via a self-evolving graph of skills and failures, improving task performance by 12–21% without retraining the executor.
This paper presents an alternative architecture for LLMs using Radial Basis Function (RBF) networks that eliminates deep neural networks and finds the global optimum in closed form, requiring no iterative training. It also reviews other non-DNN methods like KANs and k-NN retrieval, with a case study demonstrating increased explainability and faster training.
This paper formulates context distillation as a latent memory management problem, proposing a framework that stores distilled contexts as independent LoRA adapters with retrieval, routing, and self-gating to improve robustness and efficiency.
This paper introduces Micro-Macro Retrieval (M2R), a retrieve-while-generate framework that reduces hallucination in long-form LLM outputs by ensuring key information stays close to generated text. It uses curriculum learning-based reinforcement learning to train retrieval and grounding skills, showing effectiveness especially in lengthy contexts.
Proposes RAG4Outcome, a retrieval-augmented generation framework integrating multimodal clinical data (PET-CT reports, surgical records, follow-up notes) to improve prognostic prediction in chronic osteomyelitis, enhancing interpretability and clinical reliability.
AI memory systems often recall outdated or incorrect information over time, highlighting the challenge of maintaining trust in long-term memory for AI agents.
This paper proposes claim-selective certification for high-risk medical retrieval-augmented generation (RAG), decomposing responses into verifiable claims and scoring them against evidence to produce actions (full, partial, conflict, abstain) using an intent-aware selector, achieving low unsupported-claim risk and high action accuracy.
Introduces OGCaReBench, a free-form retrieval benchmark for evaluating LLMs on clinical questions that require reasoning beyond standard guidelines. Experiments show that even the best model achieves only 56% accuracy, but retrieval augmentation boosts performance to 82%.
The University of Florida Gators submission to the AmericasNLP 2026 shared task on cultural image captioning for Indigenous languages uses a two-stage pipeline, with Qwen2.5-VL for Spanish captioning and retrieval-augmented Gemini 2.5 Flash for target-language translation, achieving significant improvements over the baseline.
A systematic study on detecting Schwartz values in political text, comparing context lengths, model sizes, and retrieval-augmented generation methods. Results show that full-document context improves supervised models but not zero-shot LLMs, while retrieved moral knowledge consistently helps via early fusion.