KV Packet proposes a recomputation-free KV cache reuse framework for LLMs that uses trainable soft-token adapters to bridge the discontinuities between independently cached context chunks, eliminating prefill recomputation overhead while maintaining performance comparable to full-recomputation baselines on Llama-3.1 and Qwen2.5.
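The core idea can be sketched in a few lines: concatenate per-chunk precomputed KV caches and insert trained soft-token K/V entries at each chunk boundary instead of recomputing attention over the joined context. The shapes, function names, and use of NumPy below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

D = 8        # head dimension (toy size; assumption)
N_SOFT = 2   # soft tokens inserted per chunk boundary (assumption)

def precompute_kv(chunk_len, rng):
    """Stand-in for the K/V tensors a transformer layer would cache
    for one independently encoded context chunk."""
    return rng.normal(size=(chunk_len, D)), rng.normal(size=(chunk_len, D))

def bridge_caches(chunks, soft_k, soft_v):
    """Concatenate per-chunk KV caches, inserting trainable soft-token
    K/V entries between chunks to smooth the context discontinuity.
    No recomputation of the cached chunks takes place."""
    ks, vs = [], []
    for i, (k, v) in enumerate(chunks):
        if i > 0:                # bridge tokens go between chunks
            ks.append(soft_k)
            vs.append(soft_v)
        ks.append(k)
        vs.append(v)
    return np.concatenate(ks), np.concatenate(vs)

rng = np.random.default_rng(0)
chunks = [precompute_kv(5, rng), precompute_kv(3, rng)]
# In the real method these would be learned offline, then frozen:
soft_k = rng.normal(size=(N_SOFT, D))
soft_v = rng.normal(size=(N_SOFT, D))

K, V = bridge_caches(chunks, soft_k, soft_v)
print(K.shape)  # (10, 8): 5 cached + 2 soft + 3 cached positions
```

The point of the sketch is that reuse is a pure concatenation at inference time; only the small soft-token adapters are ever trained.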
oMLX is a new open-source tool for optimized LLM inference on Apple Silicon Macs, featuring continuous batching and tiered KV caching, all managed through a menu bar app.
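Tiered KV caching generally means keeping recently used sequences' caches in a small fast tier and spilling older ones to a larger, slower one. The class and eviction policy below are a generic illustration of that pattern, assumed for the sketch; they are not oMLX's actual API.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier KV cache: a small LRU 'hot' tier (e.g. unified
    memory) backed by a larger 'cold' spill tier (e.g. disk)."""

    def __init__(self, hot_capacity):
        self.hot = OrderedDict()     # LRU-ordered fast tier
        self.cold = {}               # slower spill tier
        self.hot_capacity = hot_capacity

    def put(self, seq_id, kv):
        self.hot[seq_id] = kv
        self.hot.move_to_end(seq_id)
        while len(self.hot) > self.hot_capacity:
            evicted, data = self.hot.popitem(last=False)  # evict LRU entry
            self.cold[evicted] = data                     # spill to cold tier

    def get(self, seq_id):
        if seq_id in self.hot:
            self.hot.move_to_end(seq_id)   # refresh recency
            return self.hot[seq_id]
        if seq_id in self.cold:            # promote on cold hit
            kv = self.cold.pop(seq_id)
            self.put(seq_id, kv)
            return kv
        return None                        # miss: sequence needs a prefill

cache = TieredKVCache(hot_capacity=2)
cache.put("a", "kv-a")
cache.put("b", "kv-b")
cache.put("c", "kv-c")                     # "a" spills to the cold tier
print(sorted(cache.hot), sorted(cache.cold))  # ['b', 'c'] ['a']
```

A cold hit is still much cheaper than recomputing the prefill, which is why serving stacks spill caches rather than simply dropping them.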