parallel-reasoning

#parallel-reasoning

CAPS: Cascaded Adaptive Pairwise Selection for Efficient Parallel Reasoning

arXiv cs.AI · 2026-05-18

CAPS introduces a cascaded adaptive selection framework for efficient parallel reasoning, reducing verifier compute costs by over 75% while outperforming existing pairwise verification methods across multiple LLM benchmarks.

HeavySkill: Heavy Thinking as the Inner Skill in Agentic Harness

Papers with Code Trending · 2026-05-04

HeavySkill is a new framework that internalizes complex reasoning as an intrinsic model skill through parallel reasoning and summarization stages, outperforming traditional orchestration methods and enabling self-evolving LLMs via reinforcement learning.

@HuggingPapers: Cut your losses in parallel reasoning STOP learns to prune doomed trajectories early by reading KV-cache states, cuttin…

X AI KOLs Timeline · 2026-04-21

The STOP method prunes doomed reasoning trajectories early by reading KV-cache states, cutting token usage by 70% and improving AIME and GPQA accuracy across models from 1.5B to 20B parameters.

Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning

arXiv cs.CL · 2026-04-20

This paper proposes STOP (SuperTOken for Pruning), a systematic framework for pruning inefficient reasoning paths early in parallel reasoning with Large Reasoning Models. The method achieves superior efficiency and effectiveness across models from 1.5B to 20B parameters, boosting GPT-OSS-20B accuracy on AIME25 from 84% to 90% under fixed compute budgets.
