Cosine Misleads: Auxiliary Losses Reshape Vision Language Models, Not Their Latents

Hugging Face Daily Papers ↗ · 2026-06-04 Cached

The paper challenges the assumption that higher cosine alignment between supervised latents and their visual targets improves accuracy in vision-language models; it instead finds a strong negative correlation between alignment and accuracy. It introduces PRISM diagnostics, which indicate that answers are decoded downstream of the latents rather than within them, and that the auxiliary loss reshapes the language model through shared parameters.
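
To make the quantities in this summary concrete, here is a minimal sketch of how one might measure cosine alignment between latents and targets and then correlate it with accuracy across model runs. It is not the paper's code or its PRISM diagnostics: the function names (`cosine`, `mean_alignment`, `pearson`) are hypothetical, and the numbers in the usage lines are made up purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_alignment(latents, targets):
    """Average cosine similarity over paired (latent, target) vectors."""
    sims = [cosine(z, t) for z, t in zip(latents, targets)]
    return sum(sims) / len(sims)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Illustrative, made-up values (NOT results from the paper):
# one alignment score and one accuracy score per hypothetical model run.
toy_alignment = [0.2, 0.4, 0.6, 0.8]
toy_accuracy = [0.70, 0.66, 0.61, 0.55]
r = pearson(toy_alignment, toy_accuracy)  # negative when accuracy falls as alignment rises
```

A negative `r` in such an analysis would mean that runs with higher latent-target alignment tend to score lower on accuracy, which is the kind of relationship the summary describes.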
