#knowledge-offloading

Knowledge Offloading: Decomposing LLMs into Sparse Backbones and Memory Modules

arXiv cs.LG · 2026-05-29

Proposes KOFF, a framework that uses structured pruning and LoRA adapters to decompose pretrained LLMs into a sparse shared backbone and domain-specific external memories, reaching 12% sparsity without significant performance loss.
