robustness

#robustness

Robustness of Graph Self-Supervised Learning to Real-World Noise: A Case Study on Text-Driven Biomedical Graphs

arXiv cs.LG ↗ · 2d ago Cached

This paper introduces NATD-GSSL, a framework evaluating the robustness of Graph Self-Supervised Learning on noisy, text-driven biomedical graphs. It demonstrates that certain GNN architectures and pretext tasks maintain performance despite real-world noise, offering practical guidance for unsupervised learning in imperfect datasets.

0 favorites 0 likes

#robustness

Chainwash: Multi-Step Rewriting Attacks on Diffusion Language Model Watermarks

arXiv cs.CL ↗ · 2d ago Cached

This research paper introduces Chainwash, a multi-step rewriting attack that effectively removes statistical watermarks from diffusion language model (LLaDA-8B-Instruct) outputs, reducing detection rates from 87.9% to 4.86% after five chained rewrites.

0 favorites 0 likes

#robustness

Are We Making Progress in Multimodal Domain Generalization? A Comprehensive Benchmark Study

Hugging Face Daily Papers ↗ · 3d ago Cached

This paper introduces MMDG-Bench, a unified benchmark for multimodal domain generalization that reveals limited progress in current methods and significant robustness challenges across diverse tasks.

0 favorites 0 likes

#robustness

Experiments or Outcomes? Probing Scientific Feasibility in Large Language Models

arXiv cs.CL ↗ · 2026-04-22 Cached

UMBC researchers show LLMs judge scientific claim feasibility better when given outcome data than experiment descriptions, and that incomplete experimental context can hurt accuracy.

0 favorites 0 likes

#robustness

When Informal Text Breaks NLI: Tokenization Failure, Distribution Shift, and Targeted Mitigations

arXiv cs.CL ↗ · 2026-04-21 Cached

This paper investigates how informal text (slang, emoji, Gen-Z filler tokens) degrades NLI accuracy in ELECTRA-small and RoBERTa-large models, identifying two distinct failure mechanisms—tokenization failure (emoji mapped to [UNK]) and distribution shift (out-of-domain noise tokens)—and proposes targeted mitigations that recover accuracy without harming clean-text performance.

0 favorites 0 likes

#robustness

Measuring Representation Robustness in Large Language Models for Geometry

arXiv cs.CL ↗ · 2026-04-21 Cached

Researchers introduce GeoRepEval, a framework to evaluate LLM robustness across equivalent geometric problem representations (Euclidean, coordinate, vector). Testing 11 LLMs on 158 geometry problems, they find accuracy gaps up to 14 percentage points based solely on representation choice, with vector formulations being a consistent failure point.

0 favorites 0 likes

#robustness

On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability

Hugging Face Daily Papers ↗ · 2026-04-17 Cached

Systematic study shows LLM-based dense retrievers outperform BERT baselines on typos and poisoning but remain vulnerable to semantic perturbations, with embedding geometry predicting robustness.

0 favorites 0 likes

#robustness

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

OpenAI Blog ↗ · 2024-04-19 Cached

OpenAI proposes an instruction hierarchy approach to defend LLMs against prompt injection and jailbreak attacks by training models to prioritize system instructions over user inputs. The method significantly improves robustness without degrading standard capabilities.

0 favorites 0 likes

#robustness

Robust adversarial inputs

OpenAI Blog ↗ · 2017-07-17

Researchers demonstrated adversarial images that reliably fool neural network classifiers across multiple scales and perspectives, challenging assumptions about the robustness of multi-scale image capture systems used in autonomous vehicles.

0 favorites 0 likes

#robustness

Attacking machine learning with adversarial examples

OpenAI Blog ↗ · 2017-02-24 Cached

This article examines adversarial attacks on machine learning models and demonstrates why gradient masking—a defensive technique that attempts to deny attackers access to useful gradients—is fundamentally ineffective. The paper shows that attackers can circumvent gradient masking by training substitute models that mimic the defended model's behavior, making the defense strategy ultimately futile.

0 favorites 0 likes

#robustness

Adversarial attacks on neural network policies

OpenAI Blog ↗ · 2017-02-08 Cached

OpenAI researchers demonstrate that adversarial attacks, previously studied in computer vision, are also effective against neural network policies in reinforcement learning, showing significant performance degradation even with small imperceptible perturbations in white-box and black-box settings.

0 favorites 0 likes

#robustness

Concrete AI safety problems

OpenAI Blog ↗ · 2016-06-21 Cached

OpenAI, Berkeley, and Stanford researchers co-authored a foundational paper identifying five concrete safety problems in modern AI systems: safe exploration, robustness to distributional shift, avoiding negative side effects, preventing reward hacking, and scalable oversight.

0 favorites 0 likes

robustness

Submit Feedback