ICML 2026 spotlight: Universal Aesthetic Alignment Narrows Artistic Expression \[R]

Reddit r/MachineLearning Papers

Summary

This ICML 2026 spotlight position paper identifies a failure mode in image-generation alignment where aesthetic preference optimization overrides explicit user intent, terming it 'reversed alignment' and testing on anti-aesthetic prompts.

I wanted to share an ICML 2026 spotlight position paper on a failure mode in image-generation alignment: aesthetic preference optimization may override explicit user intent when the requested output is anti-aesthetic or outside mainstream visual taste.

The paper frames this as **reversed alignment**. Instead of the model aligning to the user's stated preference, the output is pulled back toward the model's learned aesthetic prior. We test generation and reward models on prompts asking for blurry, distorted, low-fidelity, negative-emotion, and other anti-aesthetic images.

GitHub repo: [https://github.com/weathon/icml2026_position](https://github.com/weathon/icml2026_position)

Paper: [https://arxiv.org/abs/2512.11883](https://arxiv.org/abs/2512.11883)

OpenReview: [https://openreview.net/forum?id=1gQ4zc1Q8I](https://openreview.net/forum?id=1gQ4zc1Q8I)

I would be interested in feedback on the framing and on evaluation designs for separating prompt understanding from preference override.
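
For the reward-model side of the tests described above, one possible check is a pairwise comparison. The following is a minimal sketch, not the paper's protocol: the `Pair` type, the `RewardFn` signature, and the name `reversed_alignment_rate` are all hypothetical, and the image fields are placeholder identifiers standing in for real images. It counts how often a scorer prefers a conventionally attractive image over one that actually follows an anti-aesthetic prompt.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical scorer signature (an assumption, not a real library API):
# (prompt, image identifier) -> scalar reward.
RewardFn = Callable[[str, str], float]


@dataclass(frozen=True)
class Pair:
    prompt: str          # explicitly asks for an anti-aesthetic result
    faithful_image: str  # follows the prompt (e.g. actually blurry)
    polished_image: str  # conventionally attractive, ignores the prompt


def reversed_alignment_rate(reward: RewardFn, pairs: Sequence[Pair]) -> float:
    """Fraction of pairs where the polished image outscores the faithful one.

    Ties are counted as not reversed.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    reversed_count = sum(
        reward(p.prompt, p.polished_image) > reward(p.prompt, p.faithful_image)
        for p in pairs
    )
    return reversed_count / len(pairs)
```

A rate near 1.0 would suggest the scorer follows its aesthetic prior regardless of the prompt; a rate near 0.0 would suggest it rewards prompt adherence. A number like this only covers the reward-model side and does not by itself show whether a generator understood the prompt.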

Similar Articles

AI slop in Aestethics

Reddit r/artificial

A philosophical critique of AI-generated content ('slop') in aesthetics, arguing that quality requires coherent intentionality and style beyond mere automatic generation.

AI art makes me wonder what we actually value in art

Reddit r/ArtificialInteligence

The author reflects on how AI-generated art challenges traditional notions of artistic value, questioning whether beauty alone suffices and whether the human intent behind AI-assisted art matters.

Toward understanding and preventing misalignment generalization

OpenAI Blog

OpenAI researchers investigate 'emergent misalignment'—where fine-tuning a model on narrow incorrect behavior causes broadly unethical responses—and discover a 'misaligned persona' feature in GPT-4o's activations that mediates this phenomenon, enabling potential detection and mitigation strategies.