icml

#icml

Interpretability Can Be Actionable

arXiv cs.LG · 16h ago

This position paper argues that interpretability research should be evaluated based on actionability—the extent to which insights enable concrete decisions and interventions. The authors propose a framework with evaluation criteria aligned with practical outcomes to address the lack of real-world impact in current interpretability work.

A Single Layer to Explain Them All: Understanding Massive Activations in Large Language Models

arXiv cs.CL · yesterday

This paper identifies the 'Massive Emergence Layer' where extreme activations in LLMs originate and propagate, proposing a method to mitigate their rigidity and improve model performance on tasks like math reasoning and instruction following.

Towards Universal Gene Regulatory Network Inference: Unlocking Generalizable Regulatory Knowledge in Single-cell Foundation Models

arXiv cs.LG · yesterday

This paper introduces a new paradigm for universal Gene Regulatory Network (GRN) inference using single-cell foundation models, proposing Virtual Value Perturbation and Gradient Trajectory methods to distill regulatory knowledge.

The Safety-Aware Denoiser for Text Diffusion Models

arXiv cs.LG · yesterday

This paper introduces the Safety-Aware Denoiser (SAD), a framework for integrating safety constraints into text diffusion models during the denoising process. It aims to reduce unsafe generations while preserving quality, addressing a gap in safety research for non-autoregressive models.

Why DDIM Hallucinates More than DDPM: A Theoretical Analysis of Reverse Dynamics

arXiv cs.LG · 2d ago

This paper provides a theoretical analysis explaining why deterministic DDIM samplers hallucinate more than stochastic DDPM samplers in diffusion models, attributing it to getting stuck in mode-interpolation regions during reverse dynamics.

@probablynotaz9: Solo-author ICML paper alert Ever wanted to post-train your diffusion LLM with good old policy gradients, without havin…

X AI KOLs Following · 3d ago

This solo-author ICML paper introduces Amortized Group Relative Policy Optimization (AGRPO) to enable effective reinforcement learning post-training for diffusion language models.

Large Vision-Language Models Get Lost in Attention

arXiv cs.AI · 5d ago

This research paper analyzes the internal mechanics of Large Vision-Language Models (LVLMs) using information theory, revealing that attention mechanisms may be redundant while Feed-Forward Networks drive semantic innovation. The authors demonstrate that replacing learned attention weights with random values can yield comparable performance, suggesting current models 'get lost in attention'.

[ICML 2026] Scores increased and then decreased!! [D]

Reddit r/MachineLearning · 2026-04-16

A researcher discusses their ICML 2026 paper review experience where a reviewer increased their score during rebuttal but then decreased it again, expressing concern about rejection prospects.
