boosting

#boosting

@dair_ai: NEW paper worth reading. GPT-5.4 nano plus a critic-comparator orchestration loop hits 76.4% on SWE-bench Verified, mat…

X AI KOLs Following · 2026-05-18 Cached

A new paper shows that a weak model generating k=8 candidate patches, paired with a critic-comparator selection loop, reaches 76.4% on SWE-bench Verified, matching frontier-model performance. The key insight is that a correct patch is often already among the weak model's top-k candidates, so the hard part is selection, which the loop handles with execution verification.
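The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: `select_patch`, `passes_tests`, and `prefer_shorter` are hypothetical names, the toy patches are tiny function definitions, and the length-based comparator merely stands in for whatever critic-comparator the paper uses.

```python
from typing import Callable, Sequence


def select_patch(
    candidates: Sequence[str],
    passes_tests: Callable[[str], bool],
    prefer: Callable[[str, str], bool],
) -> str:
    """Pick one candidate: filter by execution, then run a pairwise tournament."""
    if not candidates:
        raise ValueError("need at least one candidate")
    # Execution verification: keep only candidates that pass the tests.
    verified = [c for c in candidates if passes_tests(c)]
    # If nothing passes, fall back to the full pool rather than returning nothing.
    pool = verified or list(candidates)
    best = pool[0]
    for challenger in pool[1:]:
        # Comparator: replace the current best when the challenger is preferred.
        if prefer(challenger, best):
            best = challenger
    return best


def passes_tests(patch_src: str) -> bool:
    """Toy verifier (assumption): run the patch and check two input/output pairs."""
    namespace: dict = {}
    try:
        exec(patch_src, namespace)  # toy only; real patches need sandboxing
        return namespace["add"](2, 3) == 5 and namespace["add"](-1, 1) == 0
    except Exception:
        return False


def prefer_shorter(a: str, b: str) -> bool:
    """Stand-in comparator (assumption): prefer the shorter patch."""
    return len(a) < len(b)


# Five hypothetical proposals from a weak model; two are actually correct.
candidates = [
    "def add(a, b):\n    return a - b\n",
    "def add(a, b):\n    return a * b\n",
    "def add(a, b):\n    total = a\n    total += b\n    return total\n",
    "def add(a, b):\n    return a + b\n",
    "def add(a, b)\n    return a + b\n",  # syntax error, rejected by the verifier
]

chosen = select_patch(candidates, passes_tests, prefer_shorter)
```

Filtering before comparing keeps the comparator from having to judge candidates that do not even run, which is the role the summary assigns to execution verification.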

Agentic Systems as Boosting Weak Reasoning Models

arXiv cs.AI · 2026-05-15 Cached

This paper studies verifier-backed committee search as a form of inference-time boosting for reasoning language models, showing that a committee of weak reasoning models can match the performance of much stronger models on code repair tasks such as SWE-bench Verified.
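One way to see why pooling weak proposals can help: if each proposal were independently correct with probability p, at least one of k proposals would be correct with probability 1 - (1 - p)^k. The sketch below evaluates that expression. The independence assumption, the function name `coverage`, and the value p = 0.2 are illustrative only and do not come from the paper.

```python
def coverage(p: float, k: int) -> float:
    """Probability that at least one of k independent proposals is correct."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return 1.0 - (1.0 - p) ** k


# Hypothetical per-proposal success rate of 0.2, evaluated at several pool sizes.
rates = {k: coverage(0.2, k) for k in (1, 2, 4, 8)}
```

Coverage is only an upper bound on what selection can achieve; the verifier still has to pick the correct candidate out of the pool.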
