distribution-shift

Scoped Verification for Reliable Long-Horizon Agentic Context Evolution under Distribution Shift

arXiv cs.AI · 2026-07-13

GRACE uses a typed semantic graph to represent persistent instructions for LLM agents, enabling scoped verification of updates to improve reliability under distribution shift. Experiments on a telecom agent harness show significant improvements in strict reliability over baselines.

PARA-PV: Physics-Aware Retrieval-Augmented PV Prediction Based on Frozen Foundation Model and Distribution Shift Correction

arXiv cs.AI · 2026-07-10

Introduces PARA-PV, a physics-aware retrieval-augmented framework for photovoltaic power forecasting that uses a frozen Chronos time-series foundation model and distribution shift correction to improve accuracy and handle peak, ramping, and low-power conditions.

NEST: Tackling Dataset-Level Distribution Shifts via Regime-Oriented Mixture-of-Experts

arXiv cs.LG · 2026-07-09

This paper introduces NEST, a framework that uses a regime-oriented mixture-of-experts to handle dataset-level distribution shifts in time series forecasting, achieving state-of-the-art results on various benchmarks.

Drift Happens: An Empirical Study of Neural Architecture Robustness to Temporal Distribution Shift

arXiv cs.LG · 2026-07-08

This paper presents an empirical study comparing how different neural architectures (MLPs, CNNs, RNNs, pretrained transformers) degrade under temporal distribution shift across image and text domains, finding that models exploiting localized features degrade fastest while pretrained encoders drift more gradually.

Domain-Adaptive Climate Downscaling Under Temporal Distribution Shift

arXiv cs.LG · 2026-07-08

This paper investigates temporal out-of-distribution shift in deep-learning-based climate downscaling and proposes a domain-adaptive framework that combines supervised reconstruction with domain alignment to improve high-resolution climate projections under non-stationary conditions.

Demonstrating Generalization Failures via Mixtures of Conditional Policies

arXiv cs.AI · 2026-07-07

This paper proposes a method to construct language models that exhibit controllable generalization failures when trained with reinforcement learning, demonstrating that training success can diverge from generalization in structured ways.

Loss Smoothing for Stable Adaptation Under Distribution Shift

arXiv cs.LG · 2026-07-02

Loss smoothing interpolates between source and target objectives during adaptation, preserving useful features while enabling specialization. Experiments across supervised shifts, RL, and language model fine-tuning show consistent improvements.

ComMem: Complementary Memory Systems for Test-Time Adaptation of Vision-Language Models

arXiv cs.AI · 2026-06-30

ComMem proposes complementary memory systems, inspired by biological memory, to improve test-time adaptation of vision-language models, outperforming state-of-the-art methods on 15 benchmarks.

When Can Conformal Risk Control Certify LLM Outputs? Bounds, Impossibility, and Adaptation for Structured Generation

arXiv cs.LG · 2026-06-30

This paper characterizes when conformal risk control can certify structured LLM outputs, proving impossibility bounds and analyzing certification hierarchies across different bounds. Empirical validation on six open-weight models shows that hard configurations are uncertifiable at low risk levels but practical certification is achievable at relaxed targets.

@Rossst_03: Kian Katanforoosh, Stanford CS lecturer (Forbes 30 Under 30): "Two Sigma pays $650K a year to researchers who can train…

X AI KOLs Timeline · 2026-06-26

A tweet critiques a viral thread that resells a free Stanford lecture on neural networks as a secret trading framework, highlighting that real expertise lies in handling distribution shifts, not the math itself.

Are Tabular Foundation Models Robust to Realistic Query Distribution Shifts in Microbiome Data?

arXiv cs.LG · 2026-06-25

This paper evaluates the robustness of tabular foundation models to biologically inspired distribution shifts in microbiome data, finding that protecting discriminative features is insufficient and zero-imputation is the most harmful perturbation.

Hitting a Moving Target: Test-Time Adaptation for AI Text Detection under Continual Distribution Shift

arXiv cs.CL · 2026-06-25

This paper proposes a test-time adaptation approach using semi-supervised learning for AI text detection that adapts to continual distribution shifts from new LLMs, adversarial humanization, and temporal drift, outperforming state-of-the-art supervised detectors.

When Multi-Sensor Fusion Fails to Generalize: Cattle Posture Classification Under Animal-Level and Temporal Distribution Shift

arXiv cs.LG · 2026-06-25

This paper evaluates the robustness of multi-sensor fusion for cattle posture classification under temporal distribution shift, finding that multimodal models suffer significant performance drops and that simpler single-sensor models generalize better, highlighting shortcut learning issues.

U-TTT: Towards Generalizable PET Image Denoising via Test-Time Training

Hugging Face Daily Papers · 2026-06-09

This paper introduces U-TTT, a U-shaped deep learning model with test-time training layers and dual-domain adaptation for robust PET image denoising under distribution shifts, achieving state-of-the-art performance across different dose levels and scanner types.

SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling

arXiv cs.LG · 2026-06-08

This paper proposes SCALE, a deep reinforcement learning scheduler for agentic LLM workflow DAGs that generalizes to unseen cluster sizes using cross-attention and structured representation regularization, reducing response time without retraining.

When Offline Selectors Cannot Beat the Best Single Model: A Diagnostic Study on edX Dropout Prediction

arXiv cs.LG · 2026-06-04

This paper proposes a three-stage diagnostic framework to identify why offline model selectors fail to beat the best single model, applying it to dropout prediction on edX clickstream data. The study finds that the bottleneck is local representational ambiguity rather than learner choice or distribution shift, recommending state redesign or new data collection over further algorithm tuning.

The Reliability Gap in Benchmark Auditing: Distribution Shift and Scale as Failure Modes of Contamination Detection

arXiv cs.AI · 2026-06-03

This paper identifies distribution shift and scale constraints as critical failure modes for statistical contamination detection methods in LLM benchmark auditing. Evaluating three paradigms across 27 models reveals only 199 correct outcomes out of 335 evaluations, indicating a systematic reliability gap that prevents these methods from replacing transparent data provenance.

Regime-Arrival Uncertainty in Generalization Bounds under Distribution Shift

arXiv cs.LG · 2026-06-03

This paper introduces a theoretical framework for quantifying deployment risk when training and deployment distributions differ due to latent regime dynamics modeled as a Markov-switching process, providing exact decomposition and finite-sample bounds.

TASER: Task-Aware Stein Regularisation for Geometry-Driven Robustness

arXiv cs.LG · 2026-06-01

Introduces TASER, a training-time regularization framework derived from Langevin Stein operators that encourages geometric compatibility between predictors and data density, improving adversarial robustness and stability on CIFAR-10 without significant clean accuracy degradation.

From Static Context to Calibrated Interactive RL: Mitigating Distribution Shift in Multi-turn Dialogue with Aligned Simulator

arXiv cs.AI · 2026-05-27

This paper theoretically identifies and mitigates context distribution shift in multi-turn dialogue RL, proposing Calibrated Interactive RL that couples interactive RL with simulator alignment to reduce the sim-to-real gap and achieve state-of-the-art performance.
