On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters
Summary
This paper explores using parameter-efficient fine-tuning (PEFT) as a compact substrate for persistent personal models, studying scaling up, down, and out, and introduces MinT for managing adapters.
Source: https://huggingface.co/papers/2606.02437 Published on Jun 1
#1 Paper of the day
Abstract
Parameter-efficient fine-tuning can function as a compact substrate for persistent personal models by enabling small trainable adapters to store instance-specific behaviors on top of strong foundation models.
Parameter-efficient fine-tuning (PEFT) is usually treated as a cheaper alternative to full fine-tuning. We study a broader role: small trainable adapters as persistent local state on top of strong shared foundation models. In this framing, the base model provides shared competence while adapters carry instance-specific behavior such as preferences, skills, tool habits, and memory-like updates. We organize the problem around three scaling axes: Scale Up, where stronger shared priors make small local updates more useful; Scale Down, where we study how small adapters can be while remaining reliable; and Scale Out, where many persistent adapted instances coexist. MinT provides one infrastructure example for managing adapter identity, revision, provenance, evaluation, and serving residency. Together, the results suggest that PEFT can be a compact substrate for persistent personal models rather than only a budget substitute for full fine-tuning.
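The framing in the abstract, one shared base model with many small per-instance adapters, can be illustrated with a toy sketch. It assumes a LoRA-style low-rank adapter (the abstract does not say which PEFT method is used), and `Adapter`, `AdapterRegistry`, and all field names are hypothetical illustrations, not MinT's actual API or schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]


def matvec(m: Matrix, v: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


@dataclass(frozen=True)
class Adapter:
    """Hypothetical per-instance adapter: a low-rank update plus metadata."""
    owner: str              # adapter identity
    revision: int           # monotonically increasing version
    parent: Optional[int]   # provenance: revision this one was derived from
    down: Matrix            # r x d_in projection (assumed LoRA-style)
    up: Matrix              # d_out x r projection (assumed LoRA-style)

    def num_params(self) -> int:
        return sum(len(r) for r in self.down) + sum(len(r) for r in self.up)


class AdapterRegistry:
    """Toy registry: one shared base weight, many persistent adapters."""

    def __init__(self, base_weight: Matrix) -> None:
        self.base_weight = base_weight  # shared across all instances
        self._store: Dict[Tuple[str, int], Adapter] = {}

    def register(self, adapter: Adapter) -> None:
        key = (adapter.owner, adapter.revision)
        if key in self._store:
            raise ValueError(f"revision already exists: {key}")
        self._store[key] = adapter

    def latest(self, owner: str) -> Adapter:
        revisions = [rev for (o, rev) in self._store if o == owner]
        if not revisions:
            raise KeyError(owner)
        return self._store[(owner, max(revisions))]

    def forward(self, owner: str, x: List[float]) -> List[float]:
        """Compute W x + up (down x) using the owner's latest adapter."""
        adapter = self.latest(owner)
        base_out = matvec(self.base_weight, x)
        delta = matvec(adapter.up, matvec(adapter.down, x))
        return [b + d for b, d in zip(base_out, delta)]


# Two instances share the same 2x2 base weight but behave differently.
registry = AdapterRegistry(base_weight=[[1.0, 0.0], [0.0, 1.0]])
registry.register(Adapter("alice", 1, None, down=[[1.0, 0.0]], up=[[0.0], [1.0]]))
registry.register(Adapter("bob", 1, None, down=[[0.0, 1.0]], up=[[1.0], [0.0]]))

x = [2.0, 3.0]
print(registry.forward("alice", x))  # [2.0, 5.0]
print(registry.forward("bob", x))    # [5.0, 3.0]
```

In this sketch the base weight is stored once while each instance adds only `r * (d_in + d_out)` parameters, which is the property that makes hosting many adapted instances plausible. Evaluation records and serving residency, which the abstract also lists, are omitted here.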
Similar Articles
ShadowPEFT: Shadow Network for Parameter-Efficient Fine-Tuning
ShadowPEFT introduces a centralized parameter-efficient fine-tuning method that uses a depth-shared shadow module to refine transformer layer representations, matching or outperforming LoRA/DoRA with comparable trainable parameters.
@_vmlops: FINE-TUNING A 12B MODEL ON A SINGLE GPU IS REAL NOW most people think you need a massive gpu cluster to fine-tune large…
Hugging Face's PEFT library enables parameter-efficient fine-tuning of large models on a single GPU, reducing compute and storage costs while maintaining performance.
How to Scale Mixture-of-Experts: From muP to the Maximally Scale-Stable Parameterization
This paper develops a principled scaling theory for Mixture-of-Experts (MoE) architectures, introducing the Maximally Scale-Stable Parameterization (MSSP) that ensures stable training and hyperparameter transfer across width, depth, expert width, and number of experts, validated by experiments.
From Parameters to Data: A Task-Parameter-Guided Fine-Tuning Pipeline for Efficient LLM Alignment
P2D is a unified framework that leverages task-sensitive attention heads for both data selection and structural pruning, achieving an 8.3 pp performance gain and 7.0× speedup by updating only 10% of heads on 10% of data.
PEML: Parameter-efficient Multi-Task Learning with Optimized Continuous Prompts
PEML proposes a parameter-efficient multi-task learning method that co-optimizes continuous prompts and model weights via low-rank adaptation. It achieves up to 6.67% average accuracy improvement on multiple benchmarks.