2d-strategy

Two-dimensional early exit optimisation of LLM inference

arXiv cs.CL · 2026-04-22

The authors propose a two-dimensional (2D) early-exit method that jointly trims model layers and input sentences, yielding an additional 1.4–2.3× speed-up on sentiment tasks across Llama 3.1/3.2, Gemma, and Qwen models.
