Tag
This paper introduces a novel retrieval loop for RAG that uses reflection tokens and on-demand retrieval, allowing the model to decide when to fetch documents or rely on internal knowledge, with critique and tree-decoding to improve accuracy.