shortcut-learning

#shortcut-learning

The Readout Shortcut: Positional Number Copying Dominates Arithmetic CoT Readout in Small Language Models

arXiv cs.LG · 2026-05-25

This paper identifies a "positional copying" shortcut in which small language models answer arithmetic questions by copying the last number before the answer delimiter, bypassing actual reasoning. The shortcut explains why performance is retained when CoT steps are shuffled, and it accounts for 89-92% of teacher-forcing accuracy in 1-3B models on GSM8K.

#shortcut-learning

Shortcut Solutions Learned by Transformers Impair Continual Compositional Reasoning

arXiv cs.LG · 2026-05-08

This paper investigates how shortcut solutions learned by Transformer models, specifically BERT, impair their ability to perform continual compositional reasoning. It contrasts BERT with ALBERT and finds that ALBERT's recurrent nature offers a better inductive bias for continual learning tasks.

#shortcut-learning

Chain-of-Thought Degrades Visual Spatial Reasoning Capabilities of Multimodal LLMs

Hugging Face Daily Papers · 2026-04-17

This paper shows that Chain-of-Thought prompting harms visual-spatial reasoning in multimodal LLMs, attributing the degradation to shortcut learning and to hallucination of visual details from text alone.
