Crowded in B-Space: Calibrating Shared Directions for LoRA Merging
Summary
This paper introduces Pico, a data-free method that improves LoRA adapter merging by separately calibrating the output-side matrix B to reduce interference from shared directions while preserving task-specific information. Pico achieves 3.4–8.3 point accuracy improvements over existing merging methods across math, coding, finance, and medical benchmarks.
Source: https://huggingface.co/papers/2604.16826
Published on Apr 18 · Submitted by yixuan (https://huggingface.co/yixuantt) on Apr 21
Abstract
LoRA adapter merging performance can be improved by separately calibrating the output-side matrix B to reduce interference from shared directions while preserving task-specific information.
Merging separately trained LoRA adapters is a practical alternative to joint multi-task training, but it often hurts performance. Existing methods usually treat the LoRA update ΔW = BA as a single object and do not distinguish the two LoRA matrices. We show that the main source of LoRA merge interference comes from the output-side matrix B. Across tasks, B repeatedly uses a small set of shared directions, while A remains much more task-specific. As a result, the merged adapter overemphasizes these shared directions, and task-specific information is lost. We propose Pico (Pre-merge interference calibration in output-space), a data-free method that calibrates B before merging by downscaling over-shared directions and then rescaling the merged update. Pico plugs directly into existing merging methods such as Task Arithmetic, TIES, and TSV-M. Across eight benchmarks from the math, coding, finance, and medical domains, Pico improves average accuracy by 3.4–8.3 points over the corresponding base method and achieves the best overall average performance. Pico also enables merged adapters to outperform a LoRA trained on all task data. These results show that LoRA merging works better when the two LoRA matrices are treated separately.
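To make the calibrate-then-merge idea concrete, below is a minimal NumPy sketch of one plausible reading of the abstract: measure how strongly each output-side direction of a task's B matrix is reused by the other tasks' B matrices, shrink the over-shared directions, merge the calibrated updates in Task-Arithmetic style, and rescale the merged update. The sharedness measure (column-space overlap), the shrink rule, the thresholds, and the norm-based rescaling here are illustrative assumptions, not the exact calibration defined in the Pico paper.

```python
# Hedged sketch of pre-merge calibration of the output-side matrix B.
# Sharedness measure, shrink rule, and rescaling are assumptions made
# for illustration; the actual Pico procedure is specified in the paper.
import numpy as np

def calibrate_B(Bs, shrink=0.5, share_thresh=0.6):
    """Downscale directions in each B that are heavily reused by other tasks."""
    # Orthonormal basis for each adapter's B column space, shape (d, r).
    bases = [np.linalg.qr(B)[0] for B in Bs]
    calibrated = []
    for i, (B, Q) in enumerate(zip(Bs, bases)):
        # Average squared overlap of task i's directions with every other task's B-space.
        overlaps = np.zeros(Q.shape[1])
        for j, Qj in enumerate(bases):
            if j == i:
                continue
            overlaps += np.linalg.norm(Qj.T @ Q, axis=0) ** 2
        overlaps /= max(len(Bs) - 1, 1)
        # Shrink the component of B along over-shared directions, keep the rest.
        scale = np.where(overlaps > share_thresh, shrink, 1.0)
        B_cal = B - Q @ np.diag(1.0 - scale) @ (Q.T @ B)
        calibrated.append(B_cal)
    return calibrated

def merge_with_calibration(Bs, As, lam=1.0):
    """Task-Arithmetic-style merge of calibrated updates, then rescale the result."""
    Bs_cal = calibrate_B(Bs)
    merged = lam * sum(B @ A for B, A in zip(Bs_cal, As))
    # Rescale the merged update to the average per-task update norm (an assumption).
    target_norm = np.mean([np.linalg.norm(B @ A) for B, A in zip(Bs, As)])
    return merged * target_norm / (np.linalg.norm(merged) + 1e-8)

# Toy usage: three rank-8 adapters on a 512x256 weight matrix.
rng = np.random.default_rng(0)
Bs = [rng.standard_normal((512, 8)) for _ in range(3)]
As = [rng.standard_normal((8, 256)) for _ in range(3)]
delta_W = merge_with_calibration(Bs, As)
```

Because the calibration only needs the adapters' own matrices, the sketch stays data-free, and the same calibrated Bs could in principle be fed into TIES- or TSV-M-style merging instead of the plain sum used here.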
Similar Articles
RDP LoRA: Geometry-Driven Identification for Parameter-Efficient Adaptation in Large Language Models
RDP-LoRA uses geometric trajectory analysis and the Ramer-Douglas-Peucker algorithm to automatically select the most impactful layers for parameter-efficient fine-tuning, outperforming full-layer and random LoRA baselines.
SAMoRA: Semantic-Aware Mixture of LoRA Experts for Task-Adaptive Learning
SAMoRA introduces a semantic-aware router and task-adaptive scaling to improve expert specialization and dynamic weighting in MoE-LoRA fine-tuning, outperforming prior methods on multi-task benchmarks.
Aletheia: Gradient-Guided Layer Selection for Efficient LoRA Fine-Tuning Across Architectures
Aletheia introduces a gradient-guided layer selection method for efficient LoRA fine-tuning that identifies task-relevant transformer layers via lightweight gradient probes and applies adapters selectively, achieving 15-28% training speedup across 14 models while maintaining downstream performance on MMLU, GSM8K, and HumanEval benchmarks.
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction
MiniCPM-o 4.5 is a 9B parameter multimodal model featuring Omni-Flow, a framework enabling real-time full-duplex interaction where the model can simultaneously perceive and respond proactively. It achieves state-of-the-art open-source performance comparable to Gemini 2.5 Flash and runs on edge devices with less than 12GB RAM.
JumpLoRA: Sparse Adapters for Continual Learning in Large Language Models
JumpLoRA introduces a novel sparse adapter framework for continual learning in LLMs using JumpReLU gating to dynamically isolate task parameters and prevent catastrophic forgetting. The method enhances LoRA-based approaches and outperforms state-of-the-art continual learning methods like ELLA.