Tag
SeKV is a resolution-adaptive KV cache method that organizes context into entropy-guided semantic spans stored across a GPU-CPU hierarchy, enabling selective token-level reconstruction during decoding while reducing GPU memory by 53.3% versus full caching at 128K context.