Tag
This paper introduces Program-of-Layers (PoLar), a method that allows LLMs to dynamically skip or loop pretrained layers per input, improving accuracy and efficiency over fixed-depth inference.
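
As a rough illustration of what skipping or looping layers per input could look like, the sketch below treats a "program" as a list of layer indices applied in order. This is an assumption made for illustration, not the paper's implementation: `run_layer_program`, the toy arithmetic layers, and the index-list encoding are all hypothetical, and the sketch does not show how a program would be chosen for a given input.

```python
from typing import Callable, Sequence

# A "layer" here is any function from a hidden state to a hidden state.
# Real transformer layers operate on tensors; floats keep the sketch tiny.
Layer = Callable[[float], float]


def run_layer_program(layers: Sequence[Layer], program: Sequence[int], x: float) -> float:
    """Apply pretrained layers in the order listed in `program`.

    Omitting an index skips that layer; repeating an index loops it.
    (Hypothetical encoding, assumed for illustration.)
    """
    for idx in program:
        x = layers[idx](x)
    return x


# Toy stand-ins for three pretrained layers (hypothetical).
layers = [
    lambda x: x + 1.0,  # layer 0
    lambda x: x * 2.0,  # layer 1
    lambda x: x - 3.0,  # layer 2
]

fixed_depth = run_layer_program(layers, [0, 1, 2], 1.0)    # standard pass: every layer once
skip_layer = run_layer_program(layers, [0, 2], 1.0)        # layer 1 skipped
loop_layer = run_layer_program(layers, [0, 1, 1, 2], 1.0)  # layer 1 applied twice
```

Under this encoding, fixed-depth inference is the special case where the program is `[0, 1, ..., n-1]`; a shorter program uses fewer layer applications, and a program with repeats reuses the same weights for extra computation.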