BELIEF is a structured evidence modeling and uncertainty-aware fusion framework for biomedical question answering that converts retrieved documents into evidence objects and combines symbolic Dempster-Shafer reasoning with LLM-based inference. Experiments on PubMedQA, MedQA, and MedMCQA show BELIEF achieves state-of-the-art results in the majority of settings.
A memory retrieval system inspired by episodic memory theory achieves a state-of-the-art 96.4% top-50 accuracy on the LongMemEval benchmark using Gemini Flash. It outperforms larger Pro-based baselines, and the evaluation isolates retrieval quality from model capability.
This paper evaluates six open-weight LLMs on biomedical QA under conflicting evidence conditions, revealing accuracy drops and prediction flips, and proposes a conflict-aware abstention score that improves selective accuracy.
EviMem combines IRIS for evidence-gap detection and LaceMem for layered memory to improve long-term conversational memory retrieval, achieving higher accuracy on temporal and multi-hop questions with lower latency.
CoAuthorAI is a human-in-the-loop system that combines retrieval-augmented generation and hierarchical outlines to enable accurate, coherent scientific book writing, achieving 98% recall and 82% human satisfaction in evaluations.
This paper introduces a retrieval-augmented LLM framework for financial sentiment analysis, achieving a 15-48% improvement in accuracy and F1 score over traditional models and LLMs such as ChatGPT and LLaMA.