Decoupled Mixture-of-Experts (DMoE) is a modular architecture for parametric knowledge injection. It decouples the experts and the router from the base model to enable efficient auto-regressive inference and to mitigate catastrophic forgetting.
ToolSense is an open-source diagnostic framework that generates three benchmarks (realistic retrieval, MCQ probing, QA probing) to audit LLMs' parametric tool knowledge, revealing a knowledge-retrieval dissociation where strong retrieval performance can coexist with poor factual understanding.
This paper investigates whether the benefits of reinforcement learning extend beyond reasoning tasks to the direct recall of parametric knowledge in LLMs. It demonstrates that RL with binary rewards yields significant gains on factual QA benchmarks, and that these gains come from redistributing probability mass to unlock latent knowledge rather than from acquiring new facts.
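The idea of redistributing probability mass under a binary reward can be illustrated with a toy sketch. This is not the paper's training setup: it assumes a single question with four candidate answers, a policy that is just a softmax over four logits, and an exact (non-sampled) policy gradient. The names `initial_logits`, `correct_index`, `lr`, and `steps` are hypothetical. The correct answer starts with small but nonzero probability, which stands in for latent knowledge, and the update only moves mass toward it.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_policy_gradient(logits, correct):
    """Exact gradient of the expected binary reward with respect to the logits.

    The reward is 1 for the correct answer and 0 otherwise, so the expected
    reward equals p[correct], and d p[correct] / d z_i = p[correct] * (1[i == correct] - p[i]).
    """
    p = softmax(logits)
    return [
        p[correct] * ((1.0 if i == correct else 0.0) - p[i])
        for i in range(len(logits))
    ]


def train(logits, correct, lr=1.0, steps=200):
    """Plain gradient ascent on the expected reward (toy stand-in for RL)."""
    logits = list(logits)
    for _ in range(steps):
        grad = expected_policy_gradient(logits, correct)
        logits = [z + lr * g for z, g in zip(logits, grad)]
    return logits


# Hypothetical starting point: the correct answer (index 2) is already
# represented in the distribution, but it is not the most likely answer.
initial_logits = [2.0, 1.0, 0.0, -1.0]
correct_index = 2

before = softmax(initial_logits)
after = softmax(train(initial_logits, correct_index))

print("P(correct) before:", round(before[correct_index], 3))
print("P(correct) after: ", round(after[correct_index], 3))
```

In this sketch no new answer is ever introduced: the set of candidates is fixed, and training only shifts probability from the wrong answers to one that was already present, which is the sense in which the summary contrasts unlocking latent knowledge with acquiring new facts.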