BrainSurgery: Reproducible and Reliable Declarative Weight Manipulations for Model Editing and Upcycling
Summary
BrainSurgery is a tool for reproducible and declarative weight manipulations on neural network checkpoints, enabling model editing and upcycling through YAML plans with built-in validation.
Cached at: 06/10/26, 09:44 AM
Source: https://huggingface.co/papers/2606.09707
Abstract
BrainSurgery is a tool for robust and reproducible tensor manipulation of neural network checkpoints through declarative YAML plans with built-in validation.
As deep learning models scale, managing, inspecting, and modifying large checkpoints has become increasingly challenging. Researchers often need to alter model weights for layer restructuring, precision casting, low-rank factorization, and architectural debugging, yet these workflows often rely on fragile ad-hoc Python scripts. Here, we introduce BrainSurgery, a tool for robust and reproducible “tensor surgery” on neural network checkpoints, and provide a system demonstration covering four examples and three case studies from model upcycling to LoRA extraction. By abstracting storage formats and memory management, BrainSurgery executes complex transformations through declarative YAML plans. It supports structural modifications, mathematical transformations, and tensor reshaping through expressive regex and structural targeting, while built-in assertions validate tensor shapes, data types, and values to prevent silent errors. We envision that BrainSurgery will provide a strong foundation for future research through its reproducible and validated operations.
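To make the idea of a declarative plan with regex targeting and built-in assertions concrete, here is a minimal sketch in plain Python. It is not BrainSurgery's API or plan schema: the `apply_plan` function, the `op`/`target`/`to`/`factor`/`shape` keys, and the tensor names are all hypothetical, and tensors are stood in for by nested lists so the example runs without any dependencies. The plan is written as the list of dictionaries that a YAML file might parse into.

```python
import re


def shape(t):
    """Return the shape of a nested-list 'tensor' as a tuple."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0] if t else None
    return tuple(dims)


def scale(t, factor):
    """Return a new nested list with every element multiplied by factor."""
    if isinstance(t, list):
        return [scale(x, factor) for x in t]
    return t * factor


def apply_plan(checkpoint, plan):
    """Apply each plan step to every tensor whose name fully matches 'target'.

    Hypothetical plan format, assumed for illustration only.
    The input checkpoint is left unmodified; an edited copy is returned.
    """
    state = dict(checkpoint)
    for step in plan:
        pattern = re.compile(step["target"])
        matched = [name for name in state if pattern.fullmatch(name)]
        if not matched:
            # Fail loudly instead of silently skipping a step that hit nothing.
            raise KeyError(f"no tensor matches {step['target']!r}")
        for name in matched:
            if step["op"] == "rename":
                new_name = pattern.sub(step["to"], name)
                if new_name != name and new_name in state:
                    raise KeyError(f"rename would overwrite {new_name!r}")
                state[new_name] = state.pop(name)
            elif step["op"] == "scale":
                state[name] = scale(state[name], step["factor"])
            elif step["op"] == "assert_shape":
                if shape(state[name]) != tuple(step["shape"]):
                    raise AssertionError(
                        f"{name}: shape {shape(state[name])} != {tuple(step['shape'])}"
                    )
            else:
                raise ValueError(f"unknown op {step['op']!r}")
    return state


# A toy checkpoint: tensor names mapped to nested-list values.
checkpoint = {
    "encoder.layers.0.weight": [[1.0, 2.0], [3.0, 4.0]],
    "encoder.layers.1.weight": [[5.0, 6.0], [7.0, 8.0]],
    "head.bias": [0.5, 0.5],
}

# What a small YAML plan might look like once parsed (hypothetical schema).
plan = [
    {"op": "rename", "target": r"encoder\.layers\.(\d+)\.weight",
     "to": r"backbone.blocks.\1.weight"},
    {"op": "scale", "target": r"backbone\.blocks\.\d+\.weight", "factor": 0.5},
    {"op": "assert_shape", "target": r"backbone\.blocks\.\d+\.weight",
     "shape": [2, 2]},
]

edited = apply_plan(checkpoint, plan)
```

The design choice worth noting is that a step which matches no tensor, or an assertion that fails, raises immediately; this is the kind of behavior that separates a validated plan from an ad-hoc script that silently does nothing when a name is misspelled.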