The paper addresses catastrophic forgetting in sequentially trained early-exiting neural networks and proposes two methods, based on Elastic Weight Consolidation and Learning without Forgetting, that preserve the performance of earlier exits while new exits are added.
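A minimal PyTorch sketch of how an EWC-style penalty could protect weights important to earlier exits while a new exit head is trained; the `old_params`/`fisher` dictionaries, the `lam` coefficient, and the loss combination are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def ewc_penalty(model: nn.Module,
                old_params: dict,   # parameter snapshot taken before adding the new exit
                fisher: dict,       # diagonal Fisher estimates for those parameters
                lam: float = 100.0) -> torch.Tensor:
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    penalty = 0.0
    for name, p in model.named_parameters():
        if name in fisher:
            penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# Training a newly added exit would then optimize, per batch:
#   loss = task_loss_for_new_exit + ewc_penalty(model, old_params, fisher)
# which discourages drift in the weights that earlier exits rely on.
```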
The authors propose a 2D early-exit method that jointly trims layers and input sentences, yielding a 1.4–2.3× additional speed-up on sentiment tasks across Llama 3.1/3.2, Gemma, and Qwen models.
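One plausible reading of the 2D scheme, sketched below in PyTorch: along the sequence axis the input sentence is shortened by keeping only the highest-scoring tokens at each block, and along the depth axis the forward pass stops once the pooled prediction is confident enough. The norm-based token score, the thresholds, and the `blocks`/`head` modules are assumptions for illustration, not the paper's actual criteria.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def two_dim_early_exit(blocks: nn.ModuleList,     # hypothetical transformer blocks
                       head: nn.Module,           # hypothetical classification head
                       x: torch.Tensor,           # (seq_len, hidden) token states for one sentence
                       keep_ratio: float = 0.7,   # fraction of tokens kept per block
                       exit_threshold: float = 0.9) -> torch.Tensor:
    probs = None
    for block in blocks:
        x = block(x)
        # Sequence-axis trimming: drop the lowest-scoring tokens
        # (L2 norm used here as a stand-in importance score).
        k = max(1, int(keep_ratio * x.size(0)))
        x = x[x.norm(dim=-1).topk(k).indices]
        # Depth-axis trimming: stop as soon as the pooled prediction is confident.
        probs = head(x.mean(dim=0)).softmax(dim=-1)
        if probs.max() >= exit_threshold:
            break
    return probs.argmax()
```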
River-LLM is a training-free early-exit framework for decoder-only LLMs that uses KV-sharing to eliminate KV-cache gaps, achieving a 1.71–2.16× speedup without quality loss.
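A rough sketch of the KV-sharing idea: when a token exits at some layer during decoding, the key/value pair it produced at that layer is copied into the caches of all deeper layers, so subsequent tokens that run the full depth never hit a missing cache entry. The cache layout (one (keys, values) pair per layer, sequence-first) and the function name are illustrative assumptions rather than River-LLM's actual interface.

```python
import torch

def fill_kv_gaps(kv_cache: list,      # kv_cache[l] = (keys, values), each (seq, n_heads, head_dim)
                 exit_layer: int) -> None:
    """Copy the exiting token's KV from its exit layer into every deeper layer's cache."""
    k_exit, v_exit = kv_cache[exit_layer]
    last_k, last_v = k_exit[-1:], v_exit[-1:]   # KV of the token that just exited early
    for layer in range(exit_layer + 1, len(kv_cache)):
        k, v = kv_cache[layer]
        # Skipped layers reuse the exit layer's KV instead of leaving a cache gap.
        kv_cache[layer] = (torch.cat([k, last_k], dim=0),
                           torch.cat([v, last_v], dim=0))
```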