reasoning-agents

#reasoning-agents

RICE-PO: Turning Retrieval Interactions into Credit Signals for Reasoning Agents

arXiv cs.CL · 2026-05-27

RICE-PO is a critic-free policy optimization framework that turns retrieval interactions into localized credit signals for training reasoning agents, outperforming prompt-based and group-based RL baselines on BRIGHT and BEIR benchmarks.

#reasoning-agents

Co-ReAct: Rubrics as Step-Level Collaborators for ReAct Agents

arXiv cs.AI · 2026-05-25

Co-ReAct introduces a framework that uses rubrics as step-level guidance for action selection in ReAct agents during inference, improving trajectory quality and outperforming baselines on DeepResearchBench and SQA-CS-V2.
