Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips
Summary
This paper demonstrates that deep neural networks are catastrophically vulnerable to flipping a handful of parameter sign bits, and introduces Deep Neural Lesion (DNL) and its single-pass variant 1P-DNL, which locate the most vulnerable parameters without data or optimization. The vulnerability spans multiple domains, including image classification, object detection, instance segmentation, and language models, with practical implications for model security.
Source: https://huggingface.co/papers/2502.07408
Abstract
Deep neural networks exhibit catastrophic vulnerability to minimal parameter bit flips across multiple domains; the most vulnerable bits can be identified without data or optimization and mitigated through targeted protection strategies.
Deep Neural Networks (DNNs) can be catastrophically disrupted by flipping only a handful of parameter bits. We introduce Deep Neural Lesion (DNL), a data-free and optimization-free method that locates critical parameters, and an enhanced single-pass variant, 1P-DNL, that refines this selection with one forward and backward pass on random inputs. We show that this vulnerability spans multiple domains, including image classification, object detection, instance segmentation, and reasoning large language models. In image classification, flipping just two sign bits in ResNet-50 on ImageNet reduces accuracy by 99.8%. In object detection and instance segmentation, one or two sign flips in the backbone collapse COCO detection and mask AP for Mask R-CNN and YOLOv8-seg models. In language modeling, two sign flips in different experts reduce Qwen3-30B-A3B-Thinking from 78% to 0% accuracy. We also show that selectively protecting a small fraction of vulnerable sign bits provides a practical defense against such attacks.
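The page does not spell out how DNL or 1P-DNL score parameters, so the following is only a minimal sketch under stated assumptions: it takes one forward and backward pass on random inputs (as 1P-DNL is described as doing), ranks weights by a saliency-style proxy of |weight| times |gradient|, and flips the IEEE-754 sign bit of the top two. The scoring rule, the per-tensor top-1 selection, and the batch size are illustrative assumptions, not the paper's actual criterion; the model is the standard torchvision pretrained ResNet-50. For ordinary (non-NaN) float32 values, toggling bit 31 is equivalent to negating the weight.

import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

# Load the standard pretrained ResNet-50 (downloads ImageNet weights).
model = resnet50(weights=ResNet50_Weights.IMAGENET1K_V1)
model.eval()

# One forward and one backward pass on random inputs -- no real data needed.
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, 1000, (8,))
F.cross_entropy(model(x), y).backward()

# Assumed saliency-style score: |weight| * |gradient|, top element per tensor.
candidates = []
for name, p in model.named_parameters():
    if p.grad is None:
        continue
    score = (p.detach().abs() * p.grad.abs()).flatten()
    idx = int(torch.argmax(score))
    candidates.append((score[idx].item(), name, idx))
candidates.sort(reverse=True)

def flip_sign_bit(param, flat_index):
    # Toggle IEEE-754 bit 31 of one float32 weight in place (a literal bit flip).
    flat = param.data.view(-1)
    bits = flat[flat_index:flat_index + 1].view(torch.int32)
    bits ^= torch.tensor(-2**31, dtype=torch.int32)  # bit pattern 0x80000000

params = dict(model.named_parameters())
for _, name, flat_index in candidates[:2]:  # two flips, mirroring the reported setting
    flip_sign_bit(params[name], flat_index)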
View arXiv page (https://arxiv.org/abs/2502.07408) · View PDF (https://arxiv.org/pdf/2502.07408) · Project page (https://mkimhi.github.io/DNL/) · GitHub (https://github.com/IdoGalil/maximal-brain-damage)
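The abstract's closing claim, that protecting a small fraction of vulnerable sign bits is a practical defense, can be sketched as a record-and-restore routine over whichever parameters the scoring above flags as most vulnerable. Which bits to protect, how many, and the restore policy below are illustrative assumptions rather than the paper's actual protection scheme.

import torch

def snapshot_signs(model, protected):
    # Record the sign of each protected (parameter_name, flat_index) weight.
    params = dict(model.named_parameters())
    return {(n, i): float(torch.sign(params[n].data.view(-1)[i]))
            for n, i in protected}

def verify_and_restore(model, signs):
    # Re-impose the recorded signs; return how many flipped weights were repaired.
    params = dict(model.named_parameters())
    repaired = 0
    for (n, i), s in signs.items():
        flat = params[n].data.view(-1)
        if s != 0.0 and float(torch.sign(flat[i])) != s:
            flat[i] = flat[i].abs() * s
            repaired += 1
    return repaired

# Usage sketch: protect the top-scoring entries found above, then check periodically.
# protected = [(name, idx) for _, name, idx in candidates[:1000]]
# signs = snapshot_signs(model, protected)
# ... deploy the model ...
# n_fixed = verify_and_restore(model, signs)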
Similar Articles
Adversarial attacks on neural network policies
OpenAI researchers demonstrate that adversarial attacks, previously studied in computer vision, are also effective against neural network policies in reinforcement learning, showing significant performance degradation even with small imperceptible perturbations in white-box and black-box settings.
Are Flat Minima an Illusion?
This paper challenges the common belief that flat minima cause better generalization in neural networks, arguing that 'weakness'—a reparameterization-invariant measure of function simplicity—is the true driver. Empirical results on MNIST and Fashion-MNIST show that weakness predicts generalization while sharpness anticorrelates, and the large-batch generalization advantage vanishes as training data increases.
From Signal Degradation to Computation Collapse: Uncovering the Two Failure Modes of LLM Quantization
Researchers identify two distinct failure modes in aggressive LLM quantization—Signal Degradation and Computation Collapse—and show that training-free fixes only remedy the former, indicating structural reconstruction is needed for ultra-low-bit models.
When Background Matters: Breaking Medical Vision Language Models by Transferable Attack
MedFocusLeak introduces the first transferable black-box adversarial attack on medical vision-language models, using imperceptible background perturbations to mislead clinical diagnoses across six imaging modalities.
Understanding neural networks through sparse circuits
OpenAI researchers present methods for training sparse neural networks that are easier to interpret by forcing most weights to zero, enabling the discovery of small, disentangled circuits that can explain model behavior while maintaining performance. This work aims to advance mechanistic interpretability as a complement to post-hoc analysis of dense networks and support AI safety goals.