MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
Summary
Source: https://huggingface.co/papers/2605.07850
We propose MatryoshkaLoRA, a general, Matryoshka-inspired training framework for LoRA that learns accurate hierarchical low-rank representations by inserting a fixed, carefully crafted diagonal matrix P between the existing LoRA adapters to scale their sub-ranks accordingly.
By introducing this simple modification, our general framework recovers LoRA and DyLoRA only by changing P and ensures all sub-ranks embed the available gradient information efficiently.
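Based only on the description above, here is a minimal PyTorch sketch of the idea: the standard LoRA update B A x gains a fixed diagonal P, so the update becomes B P A x and any rank prefix can be used on its own. The class name, the decaying choice of P, and the initialization are illustrative assumptions, not the paper's implementation (an all-ones P would recover vanilla LoRA, as the abstract suggests).

```python
import torch
import torch.nn as nn

class MatryoshkaLoRALinear(nn.Module):
    """Sketch of a LoRA layer with a fixed diagonal P between the adapters:
    y = W x + s * B P A x. Names and the choice of P are assumptions."""

    def __init__(self, in_features, out_features, rank, alpha=1.0):
        super().__init__()
        # Frozen base weight (stand-in for the pretrained layer).
        self.weight = nn.Parameter(torch.randn(out_features, in_features),
                                   requires_grad=False)
        # Trainable low-rank adapters: A maps down to rank, B maps back up.
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        # Fixed, non-trainable diagonal P. All-ones recovers vanilla LoRA;
        # the decaying values used here are an illustrative choice that
        # down-weights later sub-ranks so each rank prefix stays accurate.
        self.register_buffer("p", 1.0 / torch.arange(1, rank + 1).float())
        self.scaling = alpha / rank

    def forward(self, x, sub_rank=None):
        r = self.A.shape[0] if sub_rank is None else sub_rank
        # Nested "Matryoshka" prefix: keep only the first r sub-ranks.
        a, b, p = self.A[:r], self.B[:, :r], self.p[:r]
        return x @ self.weight.T + self.scaling * ((x @ a.T) * p) @ b.T
```

Under this reading, dynamic rank selection is just truncation at inference time, with no retraining:

```python
layer = MatryoshkaLoRALinear(768, 768, rank=16)
x = torch.randn(4, 768)
y_full = layer(x)               # all 16 sub-ranks
y_small = layer(x, sub_rank=4)  # dynamic rank selection at inference
```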
Our MatryoshkaLoRA supports dynamic rank selection with minimal degradation in accuracy. We further propose Area Under the Rank-Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters.
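The abstract does not spell out how AURAC is computed; a plausible minimal reading is the area under the accuracy-versus-rank curve, normalized by the rank span. The function below is a sketch under that assumption, and the sample numbers are invented for illustration.

```python
import numpy as np

def aurac(ranks, accuracies):
    """Area Under the Rank-Accuracy Curve via the trapezoid rule,
    normalized by the rank span so the result lives on the accuracy scale.
    This normalization is an assumption, not taken from the paper."""
    r = np.asarray(ranks, dtype=float)
    a = np.asarray(accuracies, dtype=float)
    order = np.argsort(r)                   # order points by rank
    r, a = r[order], a[order]
    area = np.sum((a[1:] + a[:-1]) / 2.0 * np.diff(r))  # trapezoid rule
    return float(area / (r[-1] - r[0]))

# Example: accuracy of one adapter evaluated at nested sub-ranks.
print(aurac([2, 4, 8, 16], [0.71, 0.78, 0.82, 0.84]))  # ~0.809
```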
Our results show that MatryoshkaLoRA learns more accurate hierarchical low-rank representations than prior rank-adaptive approaches and achieves superior accuracy-performance trade-offs across ranks on the evaluated datasets.
Similar Articles
BaLoRA: Bayesian Low-Rank Adaptation of Large Scale Models
BaLoRA introduces a Bayesian extension to Low-Rank Adaptation (LoRA) that provides calibrated uncertainty estimates and improves prediction accuracy by narrowing the gap with full fine-tuning.
Queryable LoRA: Instruction-Regularized Routing Over Shared Low-Rank Update Atoms
Introduces Queryable LoRA, a data-adaptive method for efficient fine-tuning that uses a shared memory of low-rank update atoms with attention-based routing and instruction regularization to enable dynamic, context-sensitive parameter updates while maintaining scalability.
Beyond LoRA vs. Full Fine-Tuning: Gradient-Guided Optimizer Routing for LLM Adaptation
This paper proposes a Mixture of LoRA and Full (MoLF) fine-tuning framework that uses gradient-guided optimizer routing to adaptively switch between LoRA and full fine-tuning. It aims to overcome the structural limitations of relying solely on static adaptation methods by combining the plasticity of full tuning with the regularization of LoRA.
Echo-LoRA: Parameter-Efficient Fine-Tuning via Cross-Layer Representation Injection
The article introduces Echo-LoRA, a new parameter-efficient fine-tuning method that injects cross-layer representations from deeper source layers into shallow LoRA modules to improve performance without adding inference-time overhead.
JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models
JumpLoRA introduces a novel sparse adapter framework for continual learning in LLMs using JumpReLU gating to dynamically isolate task parameters and prevent catastrophic forgetting. The method enhances LoRA-based approaches and outperforms state-of-the-art continual learning methods like ELLA.
