adversarial-examples

Are Safety Guarantees in Neural Networks Safe? How to Compute Trustworthy Robustness Certifications

arXiv cs.LG ↗ · yesterday

This paper introduces the apothem measure for computing trustworthy robustness certifications in neural networks, proves the intractability of volume-optimal certifications, and presents the ParallelepipedoNN system, which achieves a twofold improvement in minimum edge length on MNIST and Fashion MNIST.

What matters when synthetic training data is generated on demand?

Reddit r/ArtificialInteligence ↗ · 2026-05-14

Abliteration launches a made-to-order synthetic training data workflow that generates negative, rare, and adversarial examples for classifiers, with schema, real-world facts, labels, provenance, and export to platforms like Hugging Face.

Transfer of adversarial robustness between perturbation types

OpenAI Blog ↗ · 2019-05-03

Researchers study how adversarial robustness transfers across different perturbation types in deep neural networks, evaluating 32 attacks of 5 types on ImageNet models. Results show that robustness to one perturbation type doesn't always transfer to others and may sometimes hurt robustness elsewhere.

Introducing Activation Atlases

OpenAI Blog ↗ · 2019-03-06

OpenAI introduces Activation Atlases, a technique for visualizing and understanding the internal representations of neural networks. It helps humans discover spurious correlations and unexpected behaviors, such as an image classifier being fooled when noodles are added to an image.

Robust adversarial inputs

OpenAI Blog ↗ · 2017-07-17

Researchers demonstrated adversarial images that reliably fool neural network classifiers across multiple scales and perspectives, challenging assumptions about the robustness of multi-scale image capture systems used in autonomous vehicles.

Attacking machine learning with adversarial examples

OpenAI Blog ↗ · 2017-02-24

This article examines adversarial attacks on machine learning models and explains why gradient masking, a defense that tries to deny attackers access to useful gradients, is ineffective. Attackers can circumvent it by training a substitute model that mimics the defended model's behavior and using that substitute to craft adversarial examples that transfer to the defended model.
