distribution-shift

@Rossst_03: Kian Katanforoosh, Stanford CS lecturer (Forbes 30 Under 30): "Two Sigma pays $650K a year to researchers who can train…

X AI KOLs Timeline · 3d ago

A tweet critiques a viral thread that resells a free Stanford lecture on neural networks as a secret trading framework, highlighting that real expertise lies in handling distribution shifts, not the math itself.

Are Tabular Foundation Models Robust to Realistic Query Distribution Shifts in Microbiome Data?

arXiv cs.LG · 4d ago

This paper evaluates the robustness of tabular foundation models to biologically inspired distribution shifts in microbiome data, finding that protecting discriminative features is insufficient and zero-imputation is the most harmful perturbation.

Hitting a Moving Target: Test-Time Adaptation for AI Text Detection under Continual Distribution Shift

arXiv cs.CL · 4d ago

This paper proposes a test-time adaptation approach using semi-supervised learning for AI text detection that adapts to continual distribution shifts from new LLMs, adversarial humanization, and temporal drift, outperforming state-of-the-art supervised detectors.

When Multi-Sensor Fusion Fails to Generalize: Cattle Posture Classification Under Animal-Level and Temporal Distribution Shift

arXiv cs.LG · 4d ago

This paper evaluates the robustness of multi-sensor fusion for cattle posture classification under temporal distribution shift, finding that multimodal models suffer significant performance drops and that simpler single-sensor models generalize better, highlighting shortcut learning issues.

U-TTT: Towards Generalizable PET Image Denoising via Test-Time Training

Hugging Face Daily Papers · 2026-06-09

This paper introduces U-TTT, a U-shaped deep learning model with test-time training layers and dual-domain adaptation for robust PET image denoising under distribution shifts, achieving state-of-the-art performance across different dose levels and scanner types.

SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling

arXiv cs.LG · 2026-06-08

This paper proposes SCALE, a deep reinforcement learning scheduler for agentic LLM workflow DAGs that generalizes to unseen cluster sizes using cross-attention and structured representation regularization, reducing response time without retraining.

When Offline Selectors Cannot Beat the Best Single Model: A Diagnostic Study on edX Dropout Prediction

arXiv cs.LG · 2026-06-04

This paper proposes a three-stage diagnostic framework to identify why offline model selectors fail to beat the best single model, applying it to dropout prediction on edX clickstream data. The study finds that the bottleneck is local representational ambiguity rather than learner choice or distribution shift, recommending state redesign or new data collection over further algorithm tuning.

The Reliability Gap in Benchmark Auditing: Distribution Shift and Scale as Failure Modes of Contamination Detection

arXiv cs.AI · 2026-06-03

This paper identifies distribution shift and scale constraints as critical failure modes for statistical contamination detection methods in LLM benchmark auditing. Evaluating three paradigms across 27 models reveals only 199 correct outcomes out of 335 evaluations, indicating a systematic reliability gap that prevents these methods from replacing transparent data provenance.

Regime-Arrival Uncertainty in Generalization Bounds under Distribution Shift

arXiv cs.LG · 2026-06-03

This paper introduces a theoretical framework for quantifying deployment risk when training and deployment distributions differ due to latent regime dynamics modeled as a Markov-switching process, providing exact decomposition and finite-sample bounds.

TASER: Task-Aware Stein Regularisation for Geometry-Driven Robustness

arXiv cs.LG · 2026-06-01

This paper introduces TASER, a training-time regularization framework derived from Langevin Stein operators that encourages geometric compatibility between predictors and data density, improving adversarial robustness and stability on CIFAR-10 without significant degradation of clean accuracy.

From Static Context to Calibrated Interactive RL: Mitigating Distribution Shift in Multi-turn Dialogue with Aligned Simulator

arXiv cs.AI · 2026-05-27

This paper theoretically identifies and mitigates context distribution shift in multi-turn dialogue RL, proposing Calibrated Interactive RL that couples interactive RL with simulator alignment to reduce the sim-to-real gap and achieve state-of-the-art performance.

MARGIN: Runtime Confidence Calibration for Multi-Agent Foundation Model Coordination

arXiv cs.LG · 2026-05-25

MARGIN is a runtime confidence calibration method for multi-agent foundation model systems that learns per-agent calibration factors online, with no held-out data or retraining, and improves pairwise resolution from below random to 70-89% on hard benchmarks.

MMD-Balls as Credal Sets: A PAC-Bayesian Framework for Epistemic Uncertainty in Test-Time Adaptation

arXiv cs.LG · 2026-05-22

This paper develops a PAC-Bayesian framework for test-time adaptation that uses MMD-balls as credal sets, providing formal generalization bounds and separating epistemic from aleatoric uncertainty under distribution shift.

PIMSM: Physics-Informed Multi-Scale Mamba for Stable Neural Representations under Distribution Shift

arXiv cs.LG · 2026-05-19

This paper proposes Physics-Informed Multi-Scale Mamba (PIMSM), a state-space architecture that aligns model memory with physical timescales to improve robustness under distribution shift in scientific time series, demonstrating improvements on fMRI and weather forecasting tasks.

ICRL: Learning to Internalize Self-Critique with Reinforcement Learning

arXiv cs.AI · 2026-05-18

This paper introduces ICRL, a framework that jointly trains a solver and critic with reinforcement learning to internalize critique guidance, enabling the solver to improve without external critique. It uses distribution calibration and role-wise group advantage estimation, achieving 6-7 point gains over GRPO on agentic and mathematical reasoning tasks.

When Informal Text Breaks NLI: Tokenization Failure, Distribution Shift, and Targeted Mitigations

arXiv cs.CL · 2026-04-21

This paper investigates how informal text (slang, emoji, Gen-Z filler tokens) degrades NLI accuracy in ELECTRA-small and RoBERTa-large models. It identifies two distinct failure mechanisms, tokenization failure (emoji mapped to [UNK]) and distribution shift (out-of-domain noise tokens), and proposes targeted mitigations that recover accuracy without harming clean-text performance.

Beyond Surface Statistics: Robust Conformal Prediction for LLMs via Internal Representations

arXiv cs.CL · 2026-04-20

This paper proposes a conformal prediction framework for LLMs that leverages internal representations rather than output-level statistics, introducing Layer-Wise Information (LI) scores as nonconformity measures to improve validity-efficiency trade-offs under distribution shift. The method demonstrates stronger robustness to calibration-deployment mismatch compared to text-level baselines across QA benchmarks.
