Active Learners as Efficient PRP Rerankers
Summary
This paper reframes pairwise ranking prompting as active learning from noisy comparisons, introducing a noise-robust framework with a randomized-direction oracle to improve ranking quality under call constraints and address position bias.
Cached at: 05/20/26, 10:37 AM
Source: https://huggingface.co/papers/2605.14236
Abstract
Pairwise ranking prompting is reformulated as active learning from noisy comparisons, with improved rankers that enhance ranking quality under call constraints and address position bias through a randomized oracle.
Pairwise Ranking Prompting (PRP) elicits pairwise preference judgments from an LLM, which are then aggregated into a ranking, usually via classical sorting algorithms. However, judgments are noisy, order-sensitive, and sometimes intransitive, so sorting assumptions do not match the setting. Because sorting aims to recover a full permutation, truncating it to meet a call budget does not produce a dependable top-K. We thus reframe PRP reranking as active learning from noisy pairwise comparisons and show that active rankers are drop-in replacements that improve NDCG@10 per call in the call-constrained regime. Our noise-robust framework also introduces a randomized-direction oracle that uses a single LLM call per pair. This approach converts systematic position bias into zero-mean noise, enabling unbiased aggregate ranking without the cost of bidirectional calls.
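The randomized-direction idea in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: `llm_prefers_first` is a hypothetical stand-in for one LLM call, and the base preference strength (0.2) and the additive position bias toward the first-shown item (0.2) are invented numbers chosen only for illustration.

```python
import random


def llm_prefers_first(first, second, relevance, bias, rng):
    """Hypothetical stand-in for ONE LLM call (not a real API).

    Returns True if the simulated judge prefers the item shown first.
    Assumption: the judge leans toward the more relevant item by 0.2 and,
    independently, toward whichever item is shown first by `bias`.
    """
    diff = relevance[first] - relevance[second]
    sign = (diff > 0) - (diff < 0)  # -1, 0, or +1
    p_first = 0.5 + 0.2 * sign + bias
    return rng.random() < p_first


def fixed_order_oracle(a, b, relevance, bias, rng):
    """Always shows `a` first, so the position bias always favors `a`."""
    return a if llm_prefers_first(a, b, relevance, bias, rng) else b


def randomized_direction_oracle(a, b, relevance, bias, rng):
    """Still one call per pair, but a fair coin picks the presentation order."""
    if rng.random() < 0.5:
        return a if llm_prefers_first(a, b, relevance, bias, rng) else b
    return b if llm_prefers_first(b, a, relevance, bias, rng) else a


def win_rate(oracle, a, b, relevance, bias=0.2, trials=20000, seed=0):
    """Fraction of simulated comparisons in which `a` is declared the winner."""
    rng = random.Random(seed)
    wins = sum(oracle(a, b, relevance, bias, rng) == a for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    tied = {"a": 1, "b": 1}
    print("fixed order, tied items:", win_rate(fixed_order_oracle, "a", "b", tied))
    print("randomized,  tied items:", win_rate(randomized_direction_oracle, "a", "b", tied))
```

Under these assumptions, two equally relevant items give the first-shown item a win probability of 0.7 with a fixed order, while the randomized order gives 0.5 · 0.7 + 0.5 · 0.3 = 0.5 in expectation. The bias does not disappear from any single call; it averages out across calls, which is the sense in which it becomes zero-mean noise that an aggregator can absorb.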
Similar Articles
Active Learners as Efficient PRP Rerankers
Proposes reframing Pairwise Ranking Prompting (PRP) reranking as active learning from noisy pairwise comparisons, improving NDCG@10 per call under budget constraints, and introduces a randomized-direction oracle that uses a single LLM call per pair instead of bidirectional calls.
Rethinking the Necessity of Adaptive Retrieval-Augmented Generation through the Lens of Adaptive Listwise Ranking
This paper proposes AdaRankLLM, an adaptive retrieval framework that challenges the necessity of adaptive RAG by using listwise ranking to dynamically filter retrieved passages. The work shows that adaptive retrieval serves as a noise filter for weaker models while acting as a cost-efficiency optimizer for stronger models, with extensive experiments across multiple datasets and LLMs.
CurveRL: Principled Distribution-Aware Context Reweighting for LLM Reasoning
This paper introduces CurveRL, a principled distribution-aware prompt reweighting approach for reinforcement learning with verifiable rewards (RLVR) that improves LLM reasoning by assigning weights based on the rank and density of pass rates rather than their absolute values, consistently outperforming GRPO and other baselines.
Gradient-Based LoRA Rank Allocation Under GRPO: An Empirical Study
This study empirically demonstrates that gradient-based LoRA rank allocation, effective in supervised fine-tuning, degrades performance in GRPO-based reinforcement learning due to flatter gradient landscapes and a gradient amplification effect.
MemReranker: Reasoning-Aware Reranking for Agent Memory Retrieval
MemReranker is a reasoning-aware reranking model family (0.6B/4B) designed for agent memory retrieval, addressing limitations in semantic similarity by incorporating LLM knowledge distillation for better temporal and causal reasoning.