CAPS is a cascaded adaptive selection framework for efficient parallel reasoning that reduces verifier compute costs by over 75% while outperforming existing pairwise verification methods across multiple LLM benchmarks.
HeavySkill is a new framework that internalizes complex reasoning as an intrinsic model skill through parallel reasoning and summarization stages, outperforming traditional orchestration methods and enabling self-evolving LLMs via reinforcement learning.
The STOP method prunes doomed reasoning trajectories early using KV-cache states, cutting token usage by 70% and boosting AIME/GPQA accuracy across 1.5B–20B models.
This paper proposes STOP (SuperTOken for Pruning), a systematic framework for pruning inefficient reasoning paths early in parallel reasoning with Large Reasoning Models. The method achieves superior efficiency and effectiveness across models from 1.5B to 20B parameters, boosting GPT-OSS-20B accuracy on AIME25 from 84% to 90% under fixed compute budgets.