ICML 2026 spotlight: Universal Aesthetic Alignment Narrows Artistic Expression [R]
Summary
This ICML 2026 spotlight position paper identifies a failure mode in image-generation alignment in which aesthetic preference optimization overrides explicit user intent. The authors term this 'reversed alignment' and evaluate it on anti-aesthetic prompts.
Similar Articles
The Slop Paradox: How Synthetic Standardization Erodes Clinical Uncertainty and Cross-Modal Alignment in AI-Rewritten Radiology Reports
This paper measures information degradation in AI-rewritten radiology reports, finding that tasks producing cleaner text for multimodal training cause greater cross-modal alignment loss, a phenomenon termed the 'slop paradox'.
AI slop in Aesthetics
A philosophical critique of AI-generated content ('slop') in aesthetics, arguing that quality requires coherent intentionality and style beyond mere automatic generation.
AI art makes me wonder what we actually value in art
The author reflects on how AI-generated art challenges traditional notions of artistic value, questioning whether beauty alone suffices and whether the human intent behind AI-assisted art matters.
Toward understanding and preventing misalignment generalization
OpenAI researchers investigate 'emergent misalignment'—where fine-tuning a model on narrow incorrect behavior causes broadly unethical responses—and discover a 'misaligned persona' feature in GPT-4o's activations that mediates this phenomenon, enabling potential detection and mitigation strategies.
Emergent Misalignment Can Be Induced by Sycophancy and Reversed via Alignment Gating
The paper shows that sycophancy fine-tuning can induce emergent misalignment in language models, and proposes Alignment Gating as a method to reverse it by learning to control internal representations for unsafe responses.