Tag
The paper proposes a method to mitigate spurious correlations by disentangling the learning dynamics of core and spurious features using a two-stage sample scoring function, achieving state-of-the-art debiasing performance with only 10% of the training data.
This paper investigates memorization in diffusion models and finds that they preferentially memorize prototypical examples with common substrings, even after deduplication, and that early stopping leads to an overproduction of common motifs, dubbed 'slop'.
This paper introduces NumLeak, a framework for detecting when foundation models memorize public numeric benchmarks from pretraining rather than demonstrating out-of-sample skill. It shows that top LLMs recall values such as Fama-French returns with high fidelity and proposes a simple system-prompt defense.
This paper introduces infilling extraction, a new method for extracting training data from diffusion language models by using arbitrary binary masks, showing that such models are more vulnerable to memorization attacks than previously thought.
This paper introduces a novel task, transitive inference with exceptions, and analytically characterizes how neural network models (kernel ridge regression) balance relational generalization and memorization. The theory is validated in pretrained language models, showing systematic mistakes predicted by the theory.
This paper studies how fill-in-the-middle (FIM) pretraining affects verbatim memorization, finding that FIM more often recovers short spans while standard left-to-right training recovers long exact continuations, and that memorization under FIM grows linearly with repetitions.
This paper introduces Zero-CoT Probe (ZCP), a black-box detection method that identifies evasive data contamination in LLMs by truncating chain-of-thought reasoning and comparing performance on perturbed datasets, achieving robust detection of both direct and indirect contamination.
Vocabi is a tool that helps users translate, save, and memorize words while they read.