zeroth-order

Dominant-Layer ZO: A Single Layer Dominates Zeroth-Order Fine-Tuning of LLMs

arXiv cs.LG · 6d ago

This paper shows that zeroth-order fine-tuning of LLMs is dominated by a single decoding layer, which can be identified from its activation outliers. Fine-tuning only that layer matches or exceeds full-model fine-tuning, with up to a 4.52x speedup.
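To make the idea concrete, here is a minimal sketch of a zeroth-order update restricted to one layer. It is not the paper's code: it assumes a standard two-point (SPSA-style) estimator, and the toy model, the layer names `layer0` and `layer1`, and the helpers `zo_step`, `toy_loss`, and `run_demo` are all hypothetical. How the dominant layer is selected from activation outliers is not shown; the layer is simply chosen by name.

```python
import numpy as np


def zo_step(params, layer, loss_fn, lr=0.02, eps=1e-3, rng=None):
    """One two-point (SPSA-style) zeroth-order update applied to a single layer.

    params:  dict mapping layer name -> np.ndarray; only params[layer] changes.
    loss_fn: callable taking the params dict and returning a scalar loss.
    Assumption: this is a generic ZO estimator, not the paper's exact procedure.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = params[layer]
    z = rng.standard_normal(w.shape)  # random perturbation direction

    # Two forward passes, no backpropagation.
    params[layer] = w + eps * z
    loss_plus = loss_fn(params)
    params[layer] = w - eps * z
    loss_minus = loss_fn(params)

    # Finite-difference estimate of the directional derivative along z.
    g = (loss_plus - loss_minus) / (2.0 * eps)
    params[layer] = w - lr * g * z
    return params


def toy_loss(params):
    # Hypothetical two-layer linear model on fixed toy data.
    x = np.array([[1.0, 2.0], [3.0, -1.0], [0.5, 0.5]])
    y = np.array([[1.0], [0.0], [2.0]])
    pred = x @ params["layer0"] @ params["layer1"]
    return float(np.mean((pred - y) ** 2))


def run_demo(steps=500, seed=0):
    rng = np.random.default_rng(seed)
    params = {"layer0": np.eye(2), "layer1": np.zeros((2, 1))}
    frozen = params["layer0"].copy()
    start = toy_loss(params)
    for _ in range(steps):
        zo_step(params, "layer1", toy_loss, rng=rng)  # tune one layer only
    unchanged = np.array_equal(params["layer0"], frozen)
    return start, toy_loss(params), unchanged
```

Calling `run_demo()` returns the initial loss, the final loss, and a flag confirming that the untouched layer kept its original weights. Because only one layer is perturbed, each step draws random noise for that layer alone, which is one plausible source of the kind of speedup the summary reports.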
