Tag
This paper studies off-policy evaluation (OPE) when decision subjects (agents) strategically modify their covariates in response to a policy. It proposes a method that uses local disclosure via post-hoc explanations to reveal agents' pre-strategic covariates and to construct a doubly robust estimator of the policy value.
A paper analyzing AI agent reliability, accepted at ICML 2026, finds that even the latest frontier models (GPT 5.5, Gemini 3.1 Pro, Claude Opus 4.7) show only marginal reliability improvements over earlier versions, with low outcome consistency and persistent issues in agent scaffolding.
This ICML 2026 paper introduces Derivative Informed XC-Loss (DI-Loss), a training approach for machine-learned exchange-correlation functionals that incorporates first- and second-derivative supervision on the Grassmannian of density matrices. Across four architectures, DI-Loss reduces total-energy MAE by 66% compared to energy and density supervision alone, and improves excited-state predictions in TDDFT calculations.
RT-Lynx, accepted at ICML 2026, accelerates diffusion models by exploiting activation sparsity instead of weight sparsity, achieving up to a 1.55× speedup on linear layers while maintaining generation quality.
This paper proposes a scalable supervised fine-tuning method for training language models to propose research hypotheses across disciplines. It has been accepted at ICML 2026, and its code is open source.
MOOSE-Star presents a 7B model fine-tuned from DeepSeek-R1-Distill-Qwen-7B for scientific hypothesis discovery, along with a dataset of 108K NCBI papers. The model achieves state-of-the-art inspiration retrieval accuracy, outperforming larger models like GPT-5.4 and Gemini-3 Pro.
The Fast Byte Latent Transformer (BLT-D) has been accepted to ICML 2026, introducing a text diffusion method for parallel byte-level decoding to overcome the speed limitations of traditional byte-level language models.