Tag
This paper empirically investigates whether aligning the allocation cost with the output-space objective improves the fidelity of compressed models in ROCKET, a training-free LLM compression method. The results show a trade-off between accuracy and perplexity, and the effects are more pronounced at higher compression ratios.