parameter-space

Sparsity Curse: Understanding RLVR Model Parameter Space from Model Merging

arXiv cs.LG · 2026-06-18

This paper investigates the 'sparsity curse' in merging models trained with RLVR (Reinforcement Learning with Verifiable Rewards). It finds that sparse updates produce near-orthogonal parameter directions that hinder aggregation, and it proposes SAR-Merging, which uses Fisher information and sparsification to resolve conflicts and improve merging performance on math and coding tasks.

On the Geometry of On-Policy Distillation

Hugging Face Daily Papers · 2026-06-05

This paper characterizes the parameter space dynamics of on-policy distillation (OPD) for large language models. It shows that OPD exhibits relaxed off-principal updates and subspace locking, which distinguish it from supervised fine-tuning and from reinforcement learning with verifiable rewards.

CapVector: Learning Transferable Capability Vectors in Parametric Space for Vision-Language-Action Models

Hugging Face Daily Papers · 2026-05-11

This paper introduces CapVector, a method that decouples auxiliary training objectives from standard supervised fine-tuning in Vision-Language-Action models. By extracting transferable capability vectors and applying orthogonal regularization, it improves model performance and generalization while significantly reducing computational overhead.
