multi-hop-qa

#multi-hop-qa

Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents

arXiv cs.AI · yesterday

This paper introduces a method that uses knowledge-graph paths as intermediate supervision to improve self-evolving search agents. It addresses bottlenecks in Search Self-Play by grounding question construction in relational context and by introducing a Waypoint Coverage Reward that gives graded partial credit.


Inference-Time Budget Control for LLM Search Agents

arXiv cs.AI · yesterday

This paper introduces a two-stage inference-time budget control method for LLM search agents, using Value-of-Information scores to optimize tool-call and token allocation during multi-hop question answering.


OThink-SRR1: Search, Refine and Reasoning with Reinforced Learning for Large Language Models

arXiv cs.CL · 2026-04-23

OThink-SRR1 introduces an iterative Search-Refine-Reason framework trained with GRPO-IR reinforcement learning to reduce retrieval noise and token costs while boosting multi-hop QA accuracy.
