Researchers study how adversarial robustness transfers across different perturbation types in deep neural networks, evaluating 32 attacks spanning 5 perturbation types on ImageNet models. The results show that robustness to one perturbation type does not always transfer to other types and can even reduce robustness against them.
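The evaluation behind this kind of study can be pictured as a small harness that crafts attacks under one norm and measures a model hardened against another. Below is a minimal, hypothetical PyTorch sketch of that idea (the paper's actual attack suite is much broader); `robust_model`, `val_loader`, and the hyperparameters are placeholders, not values from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, step, iters, norm="linf"):
    """Minimal PGD in either an L_inf or L_2 ball around x (inputs in [0, 1])."""
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(iters):
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            if norm == "linf":
                x_adv += step * grad.sign()
                x_adv = x + (x_adv - x).clamp(-eps, eps)      # project to L_inf ball
            else:
                g = grad / (grad.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)
                x_adv = x_adv + step * g                       # step along unit gradient
                delta = x_adv - x
                norms = delta.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
                x_adv = x + delta * (eps / norms).clamp(max=1.0)  # project to L_2 ball
            x_adv = x_adv.clamp(0, 1)
        x_adv.requires_grad_(True)
    return x_adv.detach()

def accuracy_under(model, loader, **attack_kwargs):
    """Accuracy of `model` on adversarial inputs crafted by `pgd_attack`."""
    correct = total = 0
    for x, y in loader:
        x_adv = pgd_attack(model, x, y, **attack_kwargs)
        with torch.no_grad():
            correct += (model(x_adv).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

# e.g. a model trained against L_inf attacks, evaluated against an unseen L_2 attack:
# acc_l2 = accuracy_under(robust_model, val_loader, eps=3.0, step=0.5, iters=20, norm="l2")
```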
OpenAI introduces Activation Atlases, a technique for visualizing and understanding the internal representations of neural networks, which lets humans discover spurious correlations and surprising failure modes, such as fooling an image classifier by pasting noodles into an image.
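As a rough illustration of the first step of building such an atlas, the PyTorch sketch below collects one activation vector per spatial position of a chosen layer over a dataset; the later steps (2D projection, gridding, and feature visualization per grid cell) are only indicated in a comment. `model`, `layer`, and `loader` are assumed placeholders, not part of the published pipeline.

```python
import torch

def collect_activations(model, layer, loader, max_batches=50):
    """Record per-position activation vectors from a convolutional `layer`.
    Each spatial position of the (N, C, H, W) feature map contributes one
    C-dimensional vector, as in the activation-atlas pipeline."""
    feats = []
    def hook(_module, _inp, out):
        # (N, C, H, W) -> (N*H*W, C): one activation vector per spatial position
        feats.append(out.detach().permute(0, 2, 3, 1).reshape(-1, out.shape[1]).cpu())
    handle = layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for i, (x, _y) in enumerate(loader):
            if i >= max_batches:
                break
            model(x)
    handle.remove()
    return torch.cat(feats)

# Downstream (not shown): reduce these vectors to 2D (e.g. with UMAP), bin them
# into a grid, and run feature visualization on each cell's mean activation vector.
```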
Researchers demonstrated adversarial images that reliably fool neural network classifiers across a range of scales and viewing angles, challenging the assumption that systems which capture images at multiple scales and perspectives, such as those in autonomous vehicles, are inherently protected against adversarial examples.
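The usual recipe for such transformation-robust adversarial images is to optimize the perturbation in expectation over random transformations of the input. The PyTorch sketch below illustrates that idea for random rescalings only; the function name and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import random
import torch
import torch.nn.functional as F

def eot_adversary(model, x, target, eps=8/255, step=1/255, iters=100, samples=10):
    """Expectation-over-transformation sketch: optimize a perturbation whose
    targeted loss is low *on average* over random rescalings of the image,
    so the adversarial effect survives changes in scale."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        loss = 0.0
        for _ in range(samples):
            scale = random.uniform(0.7, 1.3)                   # random viewing scale
            size = max(1, int(x.shape[-1] * scale))
            view = F.interpolate(x + delta, size=(size, size),
                                 mode="bilinear", align_corners=False)
            view = F.interpolate(view, size=x.shape[-2:],
                                 mode="bilinear", align_corners=False)  # back to input size
            loss = loss + F.cross_entropy(model(view), target)
        grad, = torch.autograd.grad(loss / samples, delta)
        with torch.no_grad():
            delta -= step * grad.sign()                        # descend toward `target` class
            delta.clamp_(-eps, eps)                            # stay in the L_inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)           # keep image values valid
    return (x + delta).detach()
```

The same loop extends to rotations, lighting changes, or 3D rendering by swapping in other differentiable transformations before the forward pass.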
This article examines adversarial attacks on machine learning models and argues that gradient masking, a defensive technique that tries to deny attackers access to useful gradients, is ultimately ineffective: an attacker can train a substitute model that mimics the defended model's behavior, craft adversarial examples against the substitute, and transfer them to the defended model.
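A minimal sketch of that circumvention, assuming `defended_model`, `substitute`, and `loader` are placeholder classifiers and data on the same input space: fit the substitute to the defended model's output labels, then use the substitute's (unmasked) gradients to craft transferable adversarial examples.

```python
import torch
import torch.nn.functional as F

def train_substitute(defended_model, substitute, loader, epochs=5, lr=1e-3):
    """Fit `substitute` to imitate the defended model's predictions using only
    its label outputs, as in black-box substitute-model attacks."""
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    defended_model.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                labels = defended_model(x).argmax(1)   # query the defended model
            loss = F.cross_entropy(substitute(x), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return substitute

def transfer_fgsm(substitute, x, y, eps=8/255):
    """Craft FGSM examples with the substitute's useful gradients; these often
    transfer to the defended model even though its own gradients are masked."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(substitute(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()
```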