early-stopping

#early-stopping

@sheriyuo: This paper proposes ASAG, Attention-State Adaptive Generation, a training-free, plug-and-play stopping framework for re…

X AI KOLs Timeline ↗ · yesterday Cached

ASAG uses attention entropy to detect when reasoning is unproductive, stopping early to improve accuracy and reduce token generation. Experiments on Qwen3-8B show a 4.4% accuracy gain and over 40% fewer generated tokens.
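As a rough illustration of the signal this summary mentions, the sketch below computes the Shannon entropy of a single attention distribution and applies a simple threshold rule. The function names, the `threshold` and `window` parameters, and the choice to treat sustained high entropy as the stop condition are all assumptions made for illustration; the summary does not describe ASAG's actual criterion.

```python
import math

def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution.

    `weights` is a list of non-negative attention weights; they are
    normalized here so the input does not have to sum to exactly 1.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)

def should_stop(entropy_history, threshold, window):
    """Hypothetical rule (not taken from the paper): stop once the last
    `window` entropy values all exceed `threshold`."""
    if len(entropy_history) < window:
        return False
    return all(h > threshold for h in entropy_history[-window:])
```

A uniform distribution over four positions gives the maximum entropy, log 4, while a distribution concentrated on one position gives zero, so the rule above would only fire when attention stays diffuse for several consecutive steps.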

#early-stopping

Stop When Further Reasoning Won't Help: Attention-State Adaptive Generation in Reasoning Models

arXiv cs.CL ↗ · yesterday Cached

This paper proposes ASAG, a training-free method that adaptively stops reasoning in large reasoning models based on attention distributions, reducing token usage by ~40% while improving accuracy by 3.2% on benchmarks using DeepSeek-R1-Distill and Qwen3 models.

#early-stopping

MARS: Margin-Adversarial Risk-controlled Stopping for Parallel LLM Test-time Scaling

arXiv cs.AI ↗ · 5d ago Cached

This paper introduces MARS, a stopping rule for parallel LLM test-time scaling that probes partial traces to stop early without sacrificing accuracy, saving 25–47% of tokens across reasoning models on competition math benchmarks.

#early-stopping

EvalStop: Using World Feedback to Detect and Correct Reward Overoptimization in Multi-Tenant RLHF Platforms

arXiv cs.LG ↗ · 2026-06-04 Cached

EvalStop is a scheduling primitive for multi-tenant RLHF platforms that detects and corrects reward overoptimization by monitoring downstream evaluation scores and terminating jobs on consecutive declines, achieving 98% precision and 99% recall while improving job completion time by 9% and cutting wasted compute by 22%.
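The "terminate on consecutive declines" rule in this summary can be sketched as a small patience counter. This is a minimal sketch under stated assumptions: the function name `first_stop_step` and the `patience` parameter are hypothetical, and the summary does not say how many declines EvalStop requires or how it handles ties.

```python
def first_stop_step(scores, patience):
    """Return the index of the evaluation at which a job would be
    terminated, or None if it never triggers.

    `scores` holds downstream evaluation scores in the order they were
    measured. A strict drop from the previous score counts as a decline;
    any non-decline resets the counter (an assumption, not from the paper).
    """
    declines = 0
    for i in range(1, len(scores)):
        if scores[i] < scores[i - 1]:
            declines += 1
            if declines >= patience:
                return i
        else:
            declines = 0
    return None
```

With `patience=2`, the score sequence 0.50, 0.60, 0.58, 0.55 triggers at index 3, whereas a sequence that recovers after a single dip does not trigger at all.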

#early-stopping

ESPO: Early-Stopping Proximal Policy Optimization

Hugging Face Daily Papers ↗ · 2026-05-28 Cached

ESPO introduces an early-stopping mechanism for reinforcement learning that detects and terminates failed reasoning trajectories in LLMs, improving mathematical reasoning performance while reducing compute by over 20%.

#early-stopping

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

arXiv cs.LG ↗ · 2026-05-13 Cached

This paper introduces LEAP, a training-free method to accelerate inference in Diffusion Language Models (dLLMs) by detecting early-converging tokens, reducing denoising steps by 30% without losing accuracy.
