Tag
OmniISR proposes a unified framework that combines centralized learning (CL) and federated learning (FL) through intermediate supervision and regularization at hidden layers. It offers theoretical convergence guarantees and reduces the CL–FL gap by 22.60%.
This paper uses knowledge-graph paths as intermediate supervision to improve self-evolving search agents. It addresses bottlenecks in Search Self-Play by grounding question construction in relational context and by introducing a Waypoint Coverage Reward that gives graded partial credit.
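
To make the idea of graded partial credit concrete, here is a minimal sketch of what a coverage-style reward over a knowledge-graph path could look like. The summary does not give the paper's actual formula, so everything here is an assumption for illustration: the function name `waypoint_coverage_reward`, the choice of treating intermediate path entities as "waypoints", and the case-insensitive substring matching are all hypothetical.

```python
def waypoint_coverage_reward(waypoints, trajectory_text):
    """Return the fraction of distinct waypoints mentioned in the trajectory.

    Assumption: a waypoint is an entity name on the knowledge-graph path,
    and it counts as covered if its name appears in the agent's trajectory
    text (case-insensitive substring match). The paper may define coverage
    differently.
    """
    # Deduplicate so a repeated entity is not counted twice.
    unique = {w.strip().lower() for w in waypoints if w.strip()}
    if not unique:
        return 0.0  # no waypoints means no partial credit to give
    text = trajectory_text.lower()
    covered = sum(1 for w in unique if w in text)
    return covered / len(unique)


# Hypothetical path with three waypoints; the trajectory reaches two of them.
path = ["Marie Curie", "Sorbonne", "Paris"]
trajectory = "Searched for Marie Curie; found that she taught at the Sorbonne."
reward = waypoint_coverage_reward(path, trajectory)
```

Under these assumptions, an agent that reaches only some of the intermediate entities still receives a reward between 0 and 1 instead of an all-or-nothing signal, which is the property the summary describes as graded partial credit.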