Tag
EfficientRollout is a system-aware self-speculative decoding framework that accelerates reinforcement learning rollouts for LLMs by adapting drafters to evolving policies and optimizing speculative decoding regimes, reducing latency by up to 19.6%.