kv-cache-efficiency

Scratchpad Patching: Decoupling Compute from Patch Size in Byte-Level Language Models

Hugging Face Daily Papers · 2026-05-10

This paper introduces Scratchpad Patching, a technique for tokenizer-free language models that decouples compute from patch size by dynamically refreshing context within patches to reduce patch lag.
