Tag: #spurious-correlations

Shortcuts in the Tail: Debiasing via Post-Hoc Spectral Compression of Fine-Tuning Updates

arXiv cs.LG · 2026-06-09

A post-hoc method reduces spurious correlations in fine-tuned LLMs by truncating the tail of the singular value decomposition (SVD) of the weight-update matrix. It reduces the spurious-group gap by up to 5x with less than 2 percentage points of accuracy loss, and it requires neither retraining nor group labels.
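
As a rough illustration of the idea in this summary, the sketch below truncates the SVD of a weight update and adds the low-rank part back to the base weights. It is a minimal NumPy sketch under stated assumptions, not the paper's implementation: the function name `compress_update`, the fixed rank `k`, and the toy matrices are all hypothetical, and the summary does not say how the paper chooses the cutoff.

```python
import numpy as np

def compress_update(w_base, w_ft, k):
    """Keep only the top-k singular directions of the fine-tuning update.

    Assumption: a fixed rank k is used as the cutoff; the paper's actual
    selection rule is not given in the summary.
    """
    delta = w_ft - w_base                      # fine-tuning update
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    k = max(0, min(k, s.shape[0]))             # clamp k to a valid range
    delta_k = (u[:, :k] * s[:k]) @ vt[:k, :]   # drop the spectral tail
    return w_base + delta_k

# Toy data (hypothetical): a rank-2 update plus small noise in the tail.
rng = np.random.default_rng(0)
w_base = rng.normal(size=(8, 6))
low_rank = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 6))
w_ft = w_base + low_rank + 1e-3 * rng.normal(size=(8, 6))

w_compressed = compress_update(w_base, w_ft, k=2)
```

In this toy setup the compressed weights stay close to the fine-tuned ones because almost all of the update lives in the first two singular directions. Whether the discarded tail actually carries shortcut features in a real model is the paper's claim, which this sketch does not test.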

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

arXiv cs.AI · 2026-06-04

SpurAudio is a new benchmark for evaluating shortcut learning and spurious correlations in few-shot audio classification. It shows that state-of-the-art methods, including large pretrained audio foundation models, suffer significant performance degradation when background correlations are disrupted.

Mitigating Spurious Correlations with Memorization-Guided Dataset De-Biasing

arXiv cs.LG · 2026-06-03

The paper proposes a method that mitigates spurious correlations by disentangling the learning dynamics of core and spurious features with a two-stage sample scoring function, achieving state-of-the-art debiasing performance with only 10% of the training data.

Spurious Correlation Learning in Preference Optimization: Mechanisms, Consequences, and Mitigation via Tie Training

arXiv cs.LG · 2026-05-13

This paper analyzes spurious correlation learning in preference optimization methods like DPO, identifying mechanisms such as mean spurious bias and causal-spurious leakage. It proposes 'tie training' using equal-utility preference pairs as a mitigation strategy to reduce reliance on spurious features without degrading causal learning.

Product-of-Experts Training Reduces Dataset Artifacts in Natural Language Inference

arXiv cs.CL · 2026-04-22

This paper proposes Product-of-Experts (PoE) training to reduce dataset artifacts in Natural Language Inference, downweighting examples where biased models are overconfident. PoE nearly preserves accuracy on SNLI (89.10% vs. 89.30%) while reducing bias reliance by ~4.85 percentage points.
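
For readers unfamiliar with the technique, the sketch below shows the commonly used Product-of-Experts objective: the main model's and a bias-only model's log-probabilities are summed, renormalized, and scored with cross-entropy. It is a minimal NumPy sketch, not the paper's code; the function names and toy logits are hypothetical, and the paper's exact variant (for example, how the biased model is built or whether it is frozen) is not stated in the summary.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy(logits, labels):
    """Mean negative log-likelihood of the gold labels."""
    log_probs = log_softmax(logits)
    return -log_probs[np.arange(labels.shape[0]), labels].mean()

def poe_loss(main_logits, bias_logits, labels):
    """Cross-entropy on the product of the two experts' distributions.

    Summing log-probabilities multiplies the probabilities; cross_entropy
    renormalizes the product via its internal log-softmax.
    """
    combined = log_softmax(main_logits) + log_softmax(bias_logits)
    return cross_entropy(combined, labels)

# Toy example (hypothetical numbers): 3 classes, gold label 0.
labels = np.array([0])
main_logits = np.array([[1.0, 0.0, 0.0]])
confident_bias = np.array([[5.0, 0.0, 0.0]])   # biased model already "knows" the answer
uniform_bias = np.zeros((1, 3))                # biased model has no opinion

loss_plain = cross_entropy(main_logits, labels)
loss_confident = poe_loss(main_logits, confident_bias, labels)
loss_uniform = poe_loss(main_logits, uniform_bias, labels)
```

When the biased model is confidently correct, the combined loss is already small, so the main model receives little training signal from that example. With an uninformative biased model, the objective reduces to ordinary cross-entropy.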
