perturbations


Fragile Thoughts: How Large Language Models Handle Chain-of-Thought Perturbations

arXiv cs.CL · 2026-04-20

This paper presents a comprehensive empirical evaluation of how large language models handle corruptions in chain-of-thought reasoning steps, testing 13 models across 5 perturbation types (MathError, UnitConversion, Sycophancy, SkippedSteps, ExtraSteps) on mathematical reasoning tasks. The findings reveal heterogeneous vulnerability patterns with implications for deploying LLMs in multi-stage reasoning pipelines.


Adversarial attacks on neural network policies

OpenAI Blog · 2017-02-08

OpenAI researchers demonstrate that adversarial attacks, previously studied in computer vision, are also effective against neural network policies in reinforcement learning. Small, imperceptible perturbations cause significant performance degradation in both white-box and black-box settings.
